Abstract



McNamara and Dall (2011) identified novel relationships
between the abundance of a species in different environ- 
ments, the temporal properties of environmental change, 
and selection for or against dispersal. Here, the mathemat- 
ics underlying these relationships in their two-environment 
model are investigated for arbitrary numbers of environ- 
ments. The effect they described is quantified as the 
fitness-abundance covariance. The phase in the life cycle 
where the population is censused is crucial for the impli- 
cations of the fitness-abundance covariance. These rela- 
tionships are shown to connect to the population genetics 
literature on the Reduction Principle for the evolution of 
genetic systems and migration. Conditions that produce 
selection for increased unconditional dispersal are found to 
be new instances of departures from reduction described 
by the "Principle of Partial Control" proposed for the evo- 
lution of modifier genes. According to this principle, vari- 
ation that only partially controls the processes that trans- 
form the transmitted information of organisms may be se- 
lected to increase these processes. Mathematical methods 
of Karlin, Friedland, and Elsner, Johnson, and Neumann
are central in generalizing the analysis.¹ Analysis of the
adaptive landscape of the model shows that the evolution 
of conditional dispersal is very sensitive to the spectrum of 
genetic variation the population is capable of producing, 
and suggests that empirical study of particular species will 
require an evaluation of its variational properties. 

1 Introduction 

In analyzing a model of a population that disperses in 
a patchy environment subject to random environmen-
tal change, McNamara and Dall (2011) describe "how
an underappreciated evolutionary process, which we term
'The Multiplier Effect', can limit the evolutionary value 
of responding adaptively to environmental cues, and thus 
favour the evolutionary persistence of otherwise paradox-
ical unconditional strategies." By "multiplier effect", Mc-
Namara and Dall mean,



1 Dedicated to the memory of Professor Michael Neumann, one of 
whose many elegant theorems provides for a result presented here. 



If a genotype is distributed in space and its 
fitness varies with location, then selection will 
change the spatial distribution of the genotype 
through its effect on population demography. 
This process can accumulate genotype members 
in locations to which they are well suited. This 
accumulation by selection is the multiplier effect. 

It is possible, they discover, for the 'multiplier effect' to 
reverse — for there to be an excess of the population in 
the worst habitats — when there is very rapid environ- 
mental change. The environmental change they model 
is a Markov process where a large number of patches
switch independently between two environments that pro- 
duce different growth rates for a population of organ- 
isms. They find that for moderate rates of environmental 
change, populations will have higher asymptotic growth 
rates if they reduce their rate of unconditional disper- 
sal between patches, which produces effective selection for 
lower dispersal. 

Their key finding is that the reversal of the 'multiplier 
effect' due to rapid environmental change corresponds ex- 
actly with a reversal in the direction in which dispersal 
evolves: when abundance is greater on better habitats, 
lower dispersal is selected for; when abundance is greater 
on worse habitats because the environment changes so 
fast, there is selection for higher dispersal. 

McNamara and Dall conclude their paper saying, "the
multiplier effect may underpin the evolution and mainte- 
nance of unconditional strategies in many biological sys- 
tems." This is indeed the case. Their results are in fact 
part of the phenomenon already known as the "reduction 
principle", which was first described as such in models for
the evolution of linkage (Feldman 1972), and subsequently
extended to models for the evolution of mutation rates,
gene conversion, dispersal, sexual reproduction, and even
cultural transmission of traditionalism (Altenberg 1984).



The reduction principle also underlies other phenomena: 
the 'error catastrophe' in quasispecies dynamics, and the 
effect of population subdivision on the maintenance of ge- 
netic diversity. 

The Reduction Principle can be stated, in a rather gen- 
eral form, as the widely exhibited phenomenon that mixing 
reduces growth, and differential growth selects for reduced 
mixing. 
While the reduction phenomenon studied in McNamara
and Dall (2011) is not a new concept, three particular
aspects of their study are novel:

1. their discovery of conditions that cause mixing to in- 
crease growth — which addresses the open problem 
posed in Altenberg (2004, Open Question 3.1) as to
the conditions that produce departures from the re- 
duction principle; 

2. that these departures from reduction emerge from 
very rapidly changing environments; and 

3. that these departures from reduction correspond to 
reversals in the association between fitness and abun- 
dance in different environments. 



McNamara and Dall produce these results from a two-
environment model. A principal goal here is to generalize
each of these findings to arbitrary numbers of environ- 
ments. Insight on how to generalize them is provided by 
clues in their results. Some of these clues point to the 
main tool used to achieve the generalization, a theorem of 
the late Sam Karlin, to be described. 

The property described by McNamara and Dall as 'the 
multiplier effect' is here made mathematically precise, as 
a positive covariance between fitness and the excess of 
the stationary distribution of the population above what 
it would be in the absence of differential growth rates, 
as censused just after dispersal. I refer to this quantity 
as the fitness-abundance covariance, which is a bit more 
descriptive and specific than the term 'multiplier effect', 
which already has long use as a concept in economics. 

A critical aspect of using the fitness-abundance covari-
ance is the phase in the life cycle at which the census is
taken. When McNamara and Dall say that "individuals
are likely to find themselves in circumstances to which
they are well-adapted," it matters where in its life cycle
the individual finds itself: whether it is on its natal site
or has already dispersed. McNamara and Dall do not ex-
plicitly address the phase at which they take their census,
but their model shows it to be just after dispersal, before
reproduction.

The issue of census phase is explicitly addressed here, 
and is shown to critically affect properties of the fitness- 
abundance covariance. For populations censused just after 
dispersal, one cannot say in general that "individuals are 
already likely to be on the better site." As a consequence 
of this phase dependence, a novel result found here is that 
by taking a census of the populations before and after 
reproduction, one can in certain situations infer a bound 
on the duration of changing environments. 
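
As a hedged numerical illustration of this phase dependence (a toy sketch, not the model analyzed in this paper), consider two environments whose patches switch state with probability s per generation, with no dispersal at all. The function name `census`, the growth rates, and the use of an unweighted covariance between growth rate and excess abundance are all assumptions made only for this example.

```python
import numpy as np

def perron(A):
    """Spectral radius and Perron vector (normalized to sum to 1) of a positive matrix."""
    vals, vecs = np.linalg.eig(A)
    k = np.argmax(vals.real)
    v = np.abs(vecs[:, k].real)
    return vals[k].real, v / v.sum()

def census(s, d=(2.0, 0.5)):
    """Toy two-environment model (hypothetical parameters).

    s : probability that a patch switches environment each generation
    d : growth rates; d[0] is the better environment
    """
    M = np.array([[1 - s, s], [s, 1 - s]])    # column-stochastic switching matrix
    D = np.diag(d)
    rho, v = perron(M @ D)                    # census after switching, before reproduction
    u = D @ v / rho                           # census after reproduction, before switching
    pi = np.array([0.5, 0.5])                 # stationary distribution with equal growth rates
    r = np.asarray(d)
    cov = np.mean((r - r.mean()) * (v - pi))  # unweighted covariance (an assumption)
    return rho, v, u, cov

for s in (0.2, 0.9):
    rho, v, u, cov = census(s)
    print(f"s={s}: pre-reproduction {v}, post-reproduction {u}, covariance {cov:+.3f}")
```

With slow switching the pre-reproduction census is concentrated in the better environment and the covariance is positive; with switching faster than one half, the pre-reproduction census is concentrated in the worse environment while the post-reproduction census still favors the better one.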



A result in McNamara and Dall (2011) that garnered
considerable attention is that "'stupid strategies' could
be best for the genes" (University of Exeter 2011):



One underappreciated consequence of the multi-
plier effect is that because individuals tend to
be in locations to which they are well suited,
its mere existence informs an organism that it
is liable to be in favourable circumstances. This
information can outweigh environmental cues to
the contrary, so that an individual should place
more weight on the fact it exists than on any ad-
ditional cues of location quality. (McNamara and
Dall 2011)



The general analysis provided here produces results that
seem to contradict the above: philopatry is never an evo-
lutionarily stable strategy when there is any level of envi-
ronmental change; it can always be invaded by organisms
that disperse from the correct environments. 

In an attempt to resolve the apparent contradiction, I
examine more closely the adaptive landscape, that is, the
gradient of fitness over the space of conditional dispersal 
probabilities. What is found is that the evolutionarily sta- 
ble state is highly sensitive to constraints on the organis- 
mal variability for dispersal probabilities. Slight changes 
in the constraints can shift the evolutionarily stable state 
from complete philopatry to complete dispersal from some 
environments. This sensitivity means that conditional dis- 
persal may be a highly volatile trait evolutionarily. More- 
over, to understand the evolution of any particular species 
requires an analysis of the constraints on the phenotype, 
and the probabilities of generating heritable variation in 
any phenotypic direction: in short, an evolvability anal-
ysis (Wagner and Altenberg 1996).



While it is relatively straightforward to determine the 
long-term growth rates of different dispersal phenotypes, 
determining the likelihood that such phenotypes will be 
produced by the population plunges one into issues of 
the organism's perceptual and cognitive limits, ecologi- 
cal correlates, and the genotype-phenotype map, and re- 
quires specific empirical knowledge of the organism and 
its variability in order to address. This is perhaps why,
as Levinton (1988, p. 494) insightfully writes, "Evolution-
ary biologists have been mainly concerned with the fate of
variability in populations, not the generation of variabil-
ity. . . . Whatever the reason, the time has come to reem-
phasize the study of the origin of variation." A principal
finding here is that the evolutionary outcome is not de-
termined by the adaptive landscape studied here, and we
are pointed instead to examine the variational properties
of each particular species in question.

1.1 The Reduction Principle and Fisher's Fundamental Theorem of Natural Selection

The intuition as to why there should be selection for lower 
dispersal in a population at a stationary balance between 
dispersal and selection is well expressed in the following 
explanation: 
Even in the absence of genetic variability for local
adaptation in a spatially heterogeneous environ- 
ment, migration will be selected against because 
on the average an individual will disperse to an 
environment worse than the one it was born in, 
since better environments harbor more individu-
als. (Olivieri et al. 1995)



This is a description of populations that have equilibrated 
to a balance between dispersal and differential growth. 
Fisher's Fundamental Theorem is that differential growth 
rates increase the mean fitness of the population by an 
amount equal to the variance in the growth rates. When 
the population is at a stationary distribution, however, 
this requires that dispersal decrease the mean fitness by 
exactly the same amount. Fisher uses the phrase "dete-
rioration of the environment" (Fisher 1958, discussed in
Price 1972) to describe this exact counterbalance to the
variance in fitness that increases the mean fitness. But he
includes mutation in this concept: 

... an equilibrium must be established in which 
the rate of elimination is equal to the rate of mu- 
tation. To put the matter in another way we may 
say that each mutation of this kind is allowed to 
contribute exactly as much to the genetic vari- 
ance of fitness in the species as will provide a 
rate of improvement equivalent to the rate of de- 
terioration caused by the continual occurrence of 
the mutation. (Fisher 1958, p. 41)



Fisher was thinking of mutation, not dispersal, in the 
above. But as we shall see later, the same mathemat-
ics underlies both. Like the mutation/selection bal-
ance Fisher describes, dispersal will generally be to lesser
quality environments when the population has reached a 
growth/dispersal balance. 
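
The exact counterbalance at a growth/dispersal balance can be checked numerically. The following is a minimal sketch under stated assumptions: a randomly generated column-stochastic dispersal matrix `M` and growth rates `d` (both hypothetical), discrete generations, and a census just after dispersal. In discrete generations the gain in mean fitness from selection is the variance in fitness divided by the mean fitness, and at the stationary distribution dispersal removes the same amount.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.random((n, n)) + 0.1
M /= M.sum(axis=0)                   # column-stochastic: M[i, j] = P(move j -> i)
d = rng.uniform(0.5, 2.0, n)         # growth rate in each environment (hypothetical)
D = np.diag(d)

# Stationary distribution after dispersal: Perron vector of M D.
vals, vecs = np.linalg.eig(M @ D)
k = np.argmax(vals.real)
rho = vals[k].real
v = np.abs(vecs[:, k].real)
v /= v.sum()

mean_before = d @ v                           # mean fitness before selection
var_before = ((d - mean_before) ** 2) @ v     # variance in fitness before selection
u = d * v / mean_before                       # distribution after selection
gain = d @ u - mean_before                    # increase in mean fitness from selection
loss = d @ u - d @ (M @ u)                    # decrease in mean fitness from dispersal

print(gain, var_before / mean_before, loss)
```

The three printed quantities agree up to rounding, and the mean fitness before selection equals the asymptotic growth rate `rho`.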

The interchangeability of many results in population 
genetics between mutation and dispersal reflects the fact 
that an organism's location, like its genotype, is transmis-
sible information about its state, and its degree of preser-
vation during transmission is itself an organismal pheno-
type and subject to evolution (Cavalli-Sforza and Feldman
1973; Karlin and McGregor 1974; Altenberg 1984, pp. 15-16,
p. 178; Schauber et al. 2007; Odling-Smee 2007). The
issue of the faithfulness of transmission brings us to the
reduction principle.

2 A Review of the Reduction Principle



McNamara and Dall are more correct than perhaps even
they realized in noting that their subject is an "under-
appreciated evolutionary process". It is clear that aware-
ness of the body of population genetics literature on the
reduction principle has not fully percolated between dis-
ciplines. Karlin's (1982) key theorem on the reduction
phenomenon, and its application to the evolution of dis-
persal (Altenberg 1984), were independently duplicated
recently by Kirkland et al. (2006). And McNamara and
Dall (2011) were evidently unaware of the paper by Kirk-
land et al. (2006), published in a mathematics journal.



One main goal of this paper, therefore, is to provide 
a 'portal' to the reduction principle, its historical devel- 
opment, and methods of analysis for a broader audience. 



Here, I tie in the work of McNamara and Dall (2011) to
the larger stream of work on the reduction principle, and
show that their work contributes toward answering one of 
the main open problems in the field: how departures from 
the reduction phenomenon are produced. 

It may be appropriate to apologize for the density of 
equations in this paper, as equations nowadays are often 
being relegated to online-only supplements. But the sub- 
ject of this paper is in fact mathematical methodology. It 
is the mathematics that creates a single conceptual and an- 
alytical framework for dispersal, recombination, mutation, 
random environments, and multiple genetic processes. To 
show how they all share in a single body of results requires 
we delve into the mathematics. 

It should be noted that many theoretical studies con- 
strain their analysis to models having only n = 2 patches 
or genotypes, to allow explicit calculation of the eigen-
values and eigenvectors (e.g. McNamara and Dall 2011;
Steinmeyer and Wilke 2009). There are mathematical
tools from the reduction principle literature, however,
in particular the aforementioned theorems of Karlin,
that make analytical results tractable for arbitrary n. Dis- 
semination of these tools to a larger audience is another 
principal goal of this paper. They are laid out in Methods. 

2.1 Development of the Reduction Principle 

In the first analyses of genetic modifiers of mutation, re- 
combination, and migration by Marc Feldman and cowork- 
ers in the 1970s, a common result kept appearing, which 
was that reduced levels of mutation, recombination, or mi- 
gration would evolve when populations were near equilib- 
rium under a balance between the forces of selection and 
transmission. The earliest appearance of the reduction
phenomenon in the literature is perhaps Fisher's (1930,
p. 130) assertion that "the presence of pairs of factors in
the same chromosome, the selective advantage of each of
which reverses that of the other, will always tend to dimin-
ish recombination, and therefore to increase the intensity
of linkage in the chromosomes of that species." This claim
was mathematically verified by Kimura (1956). Nei (1967,
1969) posed the first three-locus model for the evolution
of recombination, with a modifier locus controlling the re-
combination between two loci under selection, and found
that only reduced recombination would evolve. The first 
fixed-point stability analysis of modifiers of recombination 
between two loci under viability selection was by Feldman
(1972), who found that recombination would be reduced 
by evolution. Subsequent studies extended the reduction 
result to larger and larger spaces of models, including mod- 
ifiers of: 



dispersal: Karlin and McGregor (1972); Balkau and Feld-
man (1973); Karlin and McGregor (1974); Teague
(1977); Asmussen (1983); Hastings (1983); Feldman
and Liberman (1986); Liberman and Feldman (1989);
Wiener and Feldman (1991, 1993);

recombination: Feldman (1972); Karlin and McGregor
(1972); Feldman and Balkau (1972, 1973); Feldman
and Krakauer (1976); Feldman et al. (1980); Liber-
man and Feldman (1986a); Feldman and Liberman
(1986); and

mutation: Karlin and McGregor (1972); Liberman and
Feldman (1986b); Feldman and Liberman (1986).



(Note that this literature prefers the term 'migration', 
while 'dispersal' is preferred in the ecology literature. Lit- 
erature searches need to include both.) 

These studies also extended the generality of the re-
duction results to include arbitrarily large modified rates,
arbitrary viability selection regimes, and multiple modifier 
alleles. They could only analyze the case of two patches or 
two alleles per selected locus, however, due to their use of 
closed-form solutions for the determinants or eigenvalues. 



Hastings (1983) is notable in extending the phenomenon
to continuous spatial variation.



Feldman (1972) proposed that the essential direction of
evolution for the recombination modifiers was reduction
in the recombination rates. Shortly thereafter, Karlin and
McGregor (1972, 1974) proposed an alternative idea, that
the underlying governor for the direction of modifier evolu-
tion was the "Mean Fitness Principle". The Mean Fitness
Principle proposed that a modifier allele increases when 
rare if and only if it changes the parameter it controls 
to a value that would increase the mean fitness of the 
population at equilibrium. Both reduction and mean fit- 
ness principles explained the known results at that time. 



However, Karlin and Carmelli (1975, Fig. 1) found an ex-
ample where reducing recombination would decrease the
mean fitness of the population, while Feldman et al. (1980)
showed that, even for this example, an allele reducing re-
combination would grow in the population. Therefore,
only the reduction principle remained unfalsified. Subse-
quent modifier gene studies have found other counterex-
amples to the mean fitness principle (Uyenoyama and
Waller 1991a,b; Wiener and Feldman 1993). Feldman
et al. (1980) is where reduction was first referred to as a
"principle".



2.2 Karlin's Theorems 

During the time period of these developments, Karlin had, 
ironically, elucidated the mathematical foundations for the 
reduction principle himself — without realizing it. 

Karlin was investigating a seemingly distant topic - 
how population subdivision would affect the maintenance 
of genetic variation. To understand how the protection of 
alleles against extinction depended on migration patterns 
and rates, Karlin (1976, 1982) developed two general theo-
rems on the spectral radius of perturbations of migration-
selection systems. The spectral radius is the growth rate
for the whole group of genotypes that comprise the pertur- 
bation as they approach a stationary distribution among 
themselves. 

These theorems show how, for two different kinds of 
variation in migration, a greater level of 'mixing' reduces 
the spectral radius of the stability matrix for the system, 
and thus may cause some alleles to lose their protection 
against extinction. Hence, greater levels of mixing would 
lead to fewer polymorphic alleles. Preparatory to this
work was the paper by Friedland and Karlin (1975). The
theorems first appear, without proof, in Karlin (1976, pp.
642-647), and with proof as Theorems 5.1 and 5.2 in Kar-
lin (1982), restated as follows:



Theorem 1 (Karlin 1982, Theorem 5.1, pp. 114-116,
197-198). Consider a family of stochastic matrices that
commute and are symmetrizable to positive definite ma-
trices:

$$\mathcal{F} := \{ M_h = L S_h R \colon M_h M_k = M_k M_h \}, \tag{1}$$

where L and R are positive diagonal matrices, and each
$S_h$ is a positive definite symmetric real matrix. Let D be
a positive diagonal matrix. Then for each $M_h, M_k \in \mathcal{F}$,
the spectral radius, $\rho$, satisfies:

$$\rho(M_h M_k D) \le \rho(M_k D).$$



Theorem 2 (Karlin 1982, Theorem 5.2, pp. 117-118,
194-196). Let M be a non-negative irreducible stochastic
matrix. Consider the family of matrices

$$M(\alpha) = (1-\alpha) I + \alpha M.$$

Then for any positive diagonal matrix D, the spectral ra-
dius

$$\rho(\alpha) = \rho(M(\alpha) D)$$

is decreasing as $\alpha$ increases (strictly provided $D \neq d\,I$).

In Theorem 5.1, 'more mixing' is produced by the applica-
tion of a second mixing operator; in Theorem 5.2, more
mixing is produced by the equal scalar multiplication of 
all the transition probabilities between states. In both 
cases, greater mixing reduces the spectral radius, which 
represents the asymptotic growth rate of a rare allele in 
Karlin's analysis. 
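
The monotonicity asserted by Theorem 5.2 is easy to check numerically. The sketch below is an illustration only: the matrix `M`, the diagonal matrix `D`, and the grid of mixing values are randomly generated stand-ins rather than anything taken from Karlin's analysis.

```python
import numpy as np

def spectral_radius(A):
    """Largest modulus among the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

rng = np.random.default_rng(1)
n = 4
M = rng.random((n, n)) + 0.05
M /= M.sum(axis=0)                        # irreducible column-stochastic matrix
D = np.diag(rng.uniform(0.5, 2.0, n))     # positive diagonal, not a multiple of I

alphas = np.linspace(0.0, 1.0, 11)
rhos = np.array([
    spectral_radius(((1 - a) * np.eye(n) + a * M) @ D)   # rho(M(alpha) D)
    for a in alphas
])

for a, r in zip(alphas, rhos):
    print(f"alpha = {a:.1f}   rho = {r:.4f}")
```

At zero mixing the spectral radius is the largest diagonal entry of `D`, and it falls at every step as the mixing parameter increases.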
Theorems 5.1 and 5.2 display certain tradeoffs in gener-
ality. In Theorem 5.2, M may be any irreducible stochas-
tic matrix, but the variation in the matrix family is re-
stricted to a single parameter: the scaling of the transi-
tion probabilities. In Theorem 5.1 on the other hand, the
variation in the matrix family is more general in that it
has up to $n-1$ degrees of freedom to vary (see Remark
for Lemma 22), but the matrix class itself is narrower with
the constraint that they be symmetrizable.
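
A simple instance of the narrower matrix class in Theorem 5.1 can also be checked numerically. In this hedged sketch the commuting family is taken to be the powers of a single symmetric, positive definite stochastic matrix `S` (so L = R = I); this construction, and every name in the code, is an assumption chosen for convenience and covers only a small part of the family the theorem allows.

```python
import numpy as np

def spectral_radius(A):
    """Largest modulus among the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

rng = np.random.default_rng(2)
n = 4

# Symmetric weights with zero diagonal, and the corresponding graph Laplacian.
W = rng.random((n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
deg = W.sum(axis=1)
laplacian = np.diag(deg) - W

# S is symmetric, stochastic, and positive definite: its eigenvalues are at least 0.2.
S = np.eye(n) - (0.4 / deg.max()) * laplacian

# Powers of S commute with one another.
family = {h: np.linalg.matrix_power(S, h) for h in (1, 2, 3)}
D = np.diag(rng.uniform(0.5, 2.0, n))

checks = [
    (h, k, spectral_radius(family[h] @ family[k] @ D), spectral_radius(family[k] @ D))
    for h in family for k in family
]
for h, k, lhs, rhs in checks:
    print(f"h={h} k={k}  rho(M_h M_k D) = {lhs:.4f}  rho(M_k D) = {rhs:.4f}")
```

In every pair the spectral radius with the additional mixing operator applied does not exceed the one without it.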

Karlin's proof of Theorem 5.2 relied upon the re- 
cently minted variational formula for the spectral radius 
of Donsker and Varadhan (1975). These results on the
reduction principle, and their means of generalization, all 
came into being in the same time period. 



3 Application of Karlin's theorems to the Evolution of Dispersal and Genetic Systems

My own contribution to the reduction principle began with 
a conjecture by Marcus Feldman (1980, personal commu-
nication). The existence of polymorphisms for genes con-
trolling recombination and mutation rates had been dis-
covered theoretically by Feldman and Balkau (1973) and
Feldman and Krakauer (1976). Generalizing from these
examples, Feldman conjectured that whenever a parame- 
ter controlled by a gene enters linearly into the recursion 
on the frequency dynamics, then a polymorphism for that 
gene would exist in which: 

1. the population, when fixed on an allele producing a 
particular value of the linear parameter, is at an equi- 
librium; 

2. each allele's average value of the parameter is equal 
to that particular value; and 

3. the gene is in linkage equilibrium with the rest of the 
genome. 

Because condition 2 was analogous to the condition for al-
leles under viability selection that their marginal fitnesses
be equal at equilibrium, these polymorphisms were called
'viability-analogous, Hardy-Weinberg' (VAHW) modifier
polymorphisms.

The repeated appearance of the VAHW polymorphisms, 
and the repeated occurrence of the reduction principle in 
models of different phenomena (recombination, mutation, 
and dispersal) prompted me to investigate the possible 
unification of these phenomena, which is provided in Al-
tenberg (1984).



It turns out that the only way a parameter can enter
linearly in the recursion is if it modifies transmission prob-
abilities rather than fitnesses. The approach to unification
was to represent all of the models in one general expres-
sion, in which the specifics of the transmission probabili-
ties $P(i \leftarrow j, k)$ (parents j and k produce offspring i) are
ignored, while the variation produced by the modifier lo-
cus is made explicit.

The form of variation studied was where the modifier
gene produced an equal scaling, $m$, of all transmission
probabilities between states, i.e. $m\,P(i \leftarrow j, k)$, when $j \neq i$
or $k \neq i$. The principal models that exhibited the re-
duction principle all incorporated this form of variation.
Equal scaling of transmission probabilities occurs when a
single transformative event acts on the transmitted infor-
mation, and the modifier gene controls the rate of this
event (Altenberg 2011).



With this explicit representation of variation, the mod-
els that had exhibited the reduction principle had stability
matrices of the form $M(m)D$ for newly introduced modi-
fier alleles, where $M(m) = (1-m)I + mP$ as in Karlin's
theorem. Once this structure is made evident, application
of Karlin's Theorem 5.2 immediately yields the result that
the growth rate of a new modifier allele was a decreasing
function of $m$, so if it reduced $m$ below the current value
in the population, it would invade, and if it increased $m$
above the current level, it would go extinct.
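
The invasion criterion just described can be sketched numerically. This is an illustration under stated assumptions, not the general model: the function `invasion_growth_rate`, the random transition matrix `P`, and the growth rates `d` are hypothetical, and the rescaling of `D` so that the resident has spectral radius 1 is a stand-in for whatever regulation holds the resident population at equilibrium.

```python
import numpy as np

def spectral_radius(A):
    """Largest modulus among the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def invasion_growth_rate(m_new, m_resident, P, d):
    """Growth rate of a rare modifier allele that scales transmission by m_new,
    in a population fixed on m_resident (illustrative assumptions only)."""
    n = len(d)
    mixing = lambda m: (1 - m) * np.eye(n) + m * P
    # Rescale D so the resident neither grows nor declines (assumed equilibrium).
    D = np.diag(d) / spectral_radius(mixing(m_resident) @ np.diag(d))
    return spectral_radius(mixing(m_new) @ D)

rng = np.random.default_rng(3)
n = 4
P = rng.random((n, n)) + 0.05
P /= P.sum(axis=0)                 # column-stochastic transition matrix
d = rng.uniform(0.5, 2.0, n)       # state-specific growth rates

for m_new in (0.2, 0.5, 0.8):
    rate = invasion_growth_rate(m_new, 0.5, P, d)
    print(f"resident m = 0.5, new m = {m_new}: growth rate {rate:.4f}")
```

A new allele with a smaller value of m than the resident has a growth rate above 1, and one with a larger value has a growth rate below 1.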

Thus evolution would reduce the rates of all of these 
various processes, or others that had never been modeled 
before but which were covered by the general formulation. 
Prior studies needed to assume only two alleles under se- 
lection, or two patches subdividing the population, be- 
cause they relied on closed-form solutions to determinants 
or eigenvalues. Karlin's theorem allowed the result to be 
generalized to arbitrary numbers of alleles and patches, ar- 
bitrary patterns of transformation, and arbitrary selection 
regimes. 

It should be noted that modifiers of segregation dis-
tortion have altogether different dynamics that merit a
separate classification (Altenberg 1984, pp. 170-178).
Slight variation among different models led to separate
treatments for modifiers of mutation and recombination
(Altenberg 1984, pp. 106-169; Altenberg and Feldman
1987), modifiers of dispersal (Altenberg 1984, pp. 77-81,
178-199), modifiers of rates of asexual vs. sexual repro-
duction (ibid. pp. 199-203), and culturally transmitted
modifiers of cultural transmission — i.e. 'traditionalism' 
(ibid. pp. 203-206). All of these phenotypes manifest the 
reduction principle for the same underlying reason, the 
spectral radius property shown in Karlin's Theorem 5.2. 

3.1 The Dispersal Modifier Results of Altenberg (1984)



The results on the evolution of dispersal modifiers in Al-
tenberg (1984, pp. 77-81, 178-199) will be briefly re-
viewed, so that the work need not be duplicated, as has
recently occurred (Kirkland et al. 2006).
The model is of an organism that has a multiple-stage
life cycle, consisting of random mating, semelparous re- 
production, selection on gametes, zygotes, and adults, and 
lastly, dispersal. The reproductive output of an organism 
depends on its patch and its diploid genotype for a gene 
under selection. The probability of dispersing between 
any two patches is scaled by a modifier gene. The model 
includes several generalizations of prior work: 

• arbitrary numbers of patches; 

• arbitrary dispersal patterns between patches, which 
may include cycles and asymmetry; 

• dispersal of either adults or gametes (but not disper-
sal of zygotes, which breaks the Hardy-Weinberg fre-
quencies of diploids and complicates the analysis);

• arbitrary hard or soft selection patterns on diploids 
and gametes; 

• arbitrary numbers of alleles at a dispersal-modifying 
locus; and 

• arbitrary number of alleles for the locus with patch- 
specific fitnesses. 

Analysis is made of the evolutionary stability of popu- 
lations near equilibrium. In order for any new modifier al- 
lele to grow or decline at a geometric rate, the equilibrium 
must possess variation in the reproductive rates among 
patches and/or genotypes. This variation was first identi-
fied as a property of equilibrium populations at mutation-
selection balance by Haldane (1937), and was later called
the 'genetic load' by Muller (1950).



The term "fitness load" was used in Altenberg (1984) to
generalize the genetic load concept to circumstances where
there may be no genes involved, in particular, to patches
with different growth rates where the stationary distribu-
tion leaves some patches as sinks and others as sources,
as they were later to be called (Pulliam 1988). The term
'selection potential', $V := \max_i(D_i)/\mathrm{mean}(D_j) - 1$, was
adopted in Altenberg and Feldman (1987) because of the
analogy to potentials in physical systems, and because
$V$ was the actual maximum potential selective advantage
that a modifier allele could accrue. $V > 0$ is necessary for
any geometric growth in the modifier allele. The condi-
tion $V = 0$ corresponds to a population at an 'ideal free
distribution' (Fretwell and Lucas 1969; Fretwell 1972).
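
For concreteness, the selection potential can be computed as in the following small sketch. The helper name `selection_potential` is hypothetical, and reading mean(D_j) as an abundance-weighted mean over the states present in the population is an assumption of this example, not something stated in the text.

```python
import numpy as np

def selection_potential(d, weights):
    """V = max_i(D_i) / mean(D_j) - 1, with the mean weighted by abundance
    (the weighting is an assumption of this illustration)."""
    d = np.asarray(d, dtype=float)
    w = np.asarray(weights, dtype=float)
    mean_d = (d @ w) / w.sum()
    return d.max() / mean_d - 1.0

# Equal reproductive rates everywhere: no selection potential.
print(selection_potential([1.0, 1.0, 1.0], [0.2, 0.3, 0.5]))

# A source patch and a sink patch with equal abundance: positive potential.
print(selection_potential([2.0, 1.0], [0.5, 0.5]))
```

Under this reading, V vanishes exactly when all occupied states have the same reproductive rate, which is the ideal free distribution case described above.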



For the dispersal modifier model in Altenberg (1984),
a positive selection potential requires some differences at
equilibrium among the terms

$$\frac{N^S(e)}{N^D(e)} \, \frac{w(e,i)}{\bar{w}(e)} \tag{2}$$

over the environments e, and genotypes i, where

$N^S(e)$ is the population size in environment e after selec-
tion, and $N^D(e)$ after dispersal,

$\bar{w}(e)$ is the mean fitness in environment e,

$w(e,i)$ is the mean fitness of the allele i under selection in
environment e,

$N^S(e) = N(e)\,\bar{w}(e)$ under hard selection, and $N^S(e)$ is
constant under soft selection.

One can see the two sources for a selection potential in
(2): ecological, i.e. variation in $N^S(e)/N^D(e)$ (mentioned
in the earlier quote of Olivieri et al. 1995), and genetic,
i.e. variation in $w(e,i)/\bar{w}(e)$.

Ideal free distributions having $V = 0$ may be produced
by the "balanced mixture polymorphisms" discussed in
Altenberg (1984, pp. 101-104, 129, 189-190, 218-222),
which are synonymous with the Nash equilibria studied in
Schreiber and Li (2011).



The main results obtained are the manifestation of the 
Reduction Principle for dispersal rates. First, we have this 
result for a modifier allele with extreme effect:

Result 3.27 (Altenberg 1984, p. 195). A modifier al-
lele which stops all migration will always increase when
introduced to a population with an equilibrium selection 
potential, for any linkage to the locus under selection. 

For modifier alleles with intermediate effects on disper- 
sal, tractability requires the assumption of tight linkage 
between the modifier locus and the selected locus. Under 
tight linkage, the stability matrix for the new modifier al- 
lele becomes a direct sum of blocks for each allele i under 
selection: 

$$\epsilon_i(t+1) = D_1 \, [(1-m) I + m M] \, D_2(i) \, \epsilon_i(t) \tag{3}$$

where M is the matrix of average dispersal probabilities
produced by modifier alleles in the equilibrium population,
and

$$D_1 = \operatorname{diag}\left[ \frac{1}{N^D(e)} \right]_{e=1}^{n}, \qquad D_2(i) = \operatorname{diag}\left[ \frac{N^S(e)\, w(e,i)}{\bar{w}(e)} \right]_{e=1}^{n},$$

where $n$ is the number of patches. Then the following is
obtained: 



Result 3.28 (Altenberg 1984, p. 199).

1. The new modifier allele, a, can change frequency at
a geometric rate, that is, $\rho(M_a D_1 D_2(i)) \neq 1$, only if
there is an equilibrium selection potential in the pop-
ulation, so that $D_1 D_2(i) \neq I$.

2. The spectral radius for the new modifier allele, a, de-
pends only on how its marginal migration matrix $M_a$
is related to the equilibrium marginal migration ma-
trix $M$. The results of Theorem 3.14 for linear vari-
ation . . . therefore apply directly:
Theorem 3.14 (Altenberg 1984, p. 137). For a tightly
linked modifier locus, when a new modifier allele, a, is
introduced to a population at a stable viability-analogous,
tensor product equilibrium (VAHW), where there is a vari-
ance in the marginal fitnesses of the selected types present,
then for $m$ as defined in (3), the new modifier allele fre-
quency will increase if $m < 1$, and it will be excluded if
$m > 1$.

Theorem 3.14 derives directly from Karlin's Theorem 5.2, which shows in addition that the asymptotic growth rate of the new modifier allele increases as $m$ decreases throughout the range of $m$.
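The monotonicity asserted by Karlin's Theorem 5.2 can be checked numerically on a small case. The following is a minimal sketch, not the method of any of the cited papers: the matrix `P`, the growth rates `D`, and the helper `perron` (a shifted power iteration) are illustrative assumptions.

```python
def perron(A, iters=5000):
    """Perron root and sum-normalized right Perron vector of a nonnegative
    irreducible matrix A (a list of rows), by shifted power iteration."""
    n = len(A)
    s = max(sum(row) for row in A)  # iterate on A + s*I so periodic A cannot oscillate
    x = [1.0 / n] * n
    for _ in range(iters):
        y = [s * x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [yi / total for yi in y]
    rho = sum(A[i][j] * x[j] for i in range(n) for j in range(n))  # e^T A x, with e^T x = 1
    return rho, x


def rho_modifier(P, D, m):
    """Spectral radius of [(1 - m) I + m P] D for column-stochastic P."""
    n = len(P)
    MD = [[((1 - m) * (i == j) + m * P[i][j]) * D[j] for j in range(n)]
          for i in range(n)]
    return perron(MD)[0]


# Hypothetical example: an irreducible column-stochastic P and unequal growth rates.
P = [[0.6, 0.3, 0.1],
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
D = [1.0, 2.0, 3.0]
ms = [0.2, 0.4, 0.6, 0.8, 1.0]
rhos = [rho_modifier(P, D, m) for m in ms]
for m, r in zip(ms, rhos):
    print(f"m = {m:.1f}  rho = {r:.6f}")
```

In this example the printed spectral radius falls as $m$ rises, which is the reduction phenomenon for this particular choice of `P` and `D`.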

Karlin's Theorem 5.2, and the dispersal modifier results above, have recently been duplicated by Kirkland et al. (2006, Theorem 3.1). They use a novel, structure-based proof for their version of Theorem 5.2, while Karlin used the Donsker and Varadhan formula for the spectral radius. They apply it to the evolution of unconditional dispersal, and prove a special case of Altenberg (1984, Result 3.28 and Theorem 3.14) where the genetics and life history stages are absent. Their results are extended to continuous time models by Schreiber and Lloyd-Smith (2009, online Appendix B), while Altenberg (2010) uses the Donsker and Varadhan formula to extend Theorem 5.2 to the continuous time case.

The results in Kirkland et al. (2006), while being special cases of Altenberg (1984) as far as the genetics are concerned, offer generalizations of the reduction principle in other new directions. Namely, they generalize the work on density-dependent population regulation first addressed for dispersal modifiers by Asmussen (1983), and cover the general case where growth rates decrease with population size (Kirkland et al. 2006, Assumptions A1-A3). They also cover the case of reducible dispersal matrices (Theorem 4.4), the case of lossy dispersal (Assumption A4), and the fate of the modifier allele far from perturbation (Theorem 3.3). They examine conditional dispersers, and analyze the evolutionarily stable state in which dispersal has been conditioned to the point where the population reaches an ideal free distribution.

It should be noted that the ideal free distribution was proposed as the ultimate evolutionarily stable state by Kimura (1967) in his 'principle of minimum genetic load'. Kimura was thinking about the evolution of mutation rates, not dispersal. But the driving force in each case (the genetic load for mutation, and the presence of sink and source populations for dispersal (Pulliam 1988)) is mathematically the same phenomenon.



4 Departures from Reduction 

While the reduction phenomenon occurs throughout a diverse class of evolutionary models, there are two principal classes in which departures from reduction are found. The first class, which will not be further addressed here, comprises situations in which the population is continually kept far from equilibrium, due to genetic drift (e.g. the Hill-Robertson effect (Barton and Otto 2005; Roze and Barton 2006; Keightley and Otto 2006), also Gillespie (1981a)), varying selection regimes (Charlesworth 1976; Gillespie 1981b; Ishii et al. 1989; Sasaki and Iwasa 1987; Bergman and Feldman 1990; Wiener and Tuljapurkar 1994; Schreiber and Li 2011; Blanquart and Gandon 2010), or flux of beneficial mutations (Eshel 1973a,b; Kessler and Levine 1998).



The second class comprises cases of populations near equilibrium where multiple transformation processes act on the transmissible information of the organism. Studies of multiple transformation processes where departures from reduction are found include the evolution of:

• recombination in the presence of mutation (Feldman et al. 1980; Charlesworth 1997; Pylkov et al. 1998). The greatest attention has been given to this combination. The departures from the reduction result in this case are the basis of the 'deterministic mutation hypothesis' for the evolution of sex (Kondrashov 1982, 1984; Kouyos et al. 2007);

• recombination in the presence of dispersal (Charlesworth and Charlesworth et al. 1998);

• multiple mutation processes (Altenberg 1984, pp. 137-151);

• recombination in the presence of segregation and syngamy (which self-fertilization exposes in the recursion) (Charlesworth et al. 1979; Holsinger and Feldman 1983a);

• mutation in the presence of segregation and syngamy (exposed in the recursion by self-fertilization (Holsinger and Feldman 1983b), or fertility selection (Holsinger et al. 1986; Twomey and Feldman 1990)).

It is notable that in their studies of dispersal in the presence of mutation, Wiener and Feldman (1991, 1993) found no departures from the reduction principle.

The pattern of departures from the reduction principle caused by multiple transformation processes was summarized in Altenberg (1984, pp. 149, 225-228) by a simple heuristic:

The principle of partial control: When the modifier gene has only partial control over the transformations occurring at loci under selection, then it may be possible for the part which it controls to evolve an increase in rates.

In several cases where multiple transformation processes produce departures from reduction, the stability matrix on the modifier gene has the form

$$\mathbf{M}(m) = (1-m)\mathbf{A} + m\mathbf{B}. \tag{4}$$



Matrices of the form (4) also appear when the modifier gene is not tightly linked to the loci under selection (Feldman 1972; Altenberg 1984, p. 135; Altenberg and Feldman 1987; Altenberg 2009b). Karlin's Theorem 5.2 does not apply to such matrices, leaving an entire class of models as an unsolved open problem. In a survey of open problems in the spectral analysis of evolutionary dynamics (Altenberg 2004), the following problem was posed:



Open Question (3.1 in Altenberg 2004). Let $\mathbf{A}$ and $\mathbf{B}$ be irreducible stochastic matrices and let $\mathbf{D} \neq c\mathbf{I}$ be a positive diagonal matrix. Define

$$\mathbf{M}(\mu) = (1-\mu)\mathbf{A} + \mu\mathbf{B}. \tag{5}$$

For what conditions on $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{D}$ is the spectral radius $\rho(\mathbf{M}(\mu)\mathbf{D})$ strictly decreasing in $\mu$, for $0 < \mu < 1$, or

$$\frac{d}{d\mu}\,\rho(\mathbf{M}(\mu)\mathbf{D}) < 0?$$

This open problem brings us back to the paper by McNamara and Dall (2011).



5 The Model of McNamara and Dall 
(2011) 



The model of McNamara and Dall (2011, eq. (D.10)) is an example of 'partial control' (4), where we set $\mathbf{A} = \mathbf{P}$ and $\mathbf{B} = \boldsymbol{\pi}\mathbf{e}^\top$, $\boldsymbol{\pi}$ being the stationary distribution for stochastic matrix $\mathbf{P}$, i.e. $\mathbf{P}\boldsymbol{\pi} = \boldsymbol{\pi}$. A major point of interest is that McNamara and Dall find conditions on $\mathbf{P}$ that produce departures from the reduction phenomenon, providing another example of the principle of partial control, and contributing towards answering the open problem posed in Altenberg (2004), above. The recursion for their model is

$$\mathbf{z}(t+1) = \mathbf{M}(m)\,\mathbf{D}\,\mathbf{z}(t), \tag{6}$$

where

$$\mathbf{M}(m) := (1-m)\mathbf{P} + m\,\boldsymbol{\pi}\mathbf{e}^\top. \tag{7}$$

The control exerted by $m$ over the transformations occurring in the system in (7) is only partial because the environment itself undergoes transformation, represented by $\mathbf{P}$, and the organism cannot eliminate $\mathbf{P}$, but only shift between $\mathbf{P}$ and $\boldsymbol{\pi}\mathbf{e}^\top$.

The McNamara and Dall model represents the following. Let $z_i(t)$ be the number of individuals in environment $i$ at time $t$, and $z_i(t+1)$ be the number after one iteration of reproduction and dispersal.

1. An individual is born into a site with environment type $i$;

2. The individual reproduces on the site, and produces an average of $D_i$ offspring when in environment $i$;

3. Each offspring disperses independently with probability $m$ to a random site;

4. In one generation, sites of environment type $j$ change randomly and independently to type $i$ with probability $P_{ij}$;

5. The sites have settled down to a stationary distribution, so the probability that the site will be in environment state $i$ is $\pi_i$.

Recursion (6) in summation form is:

$$z_i(t+1) = (1-m)\sum_{j=1}^{n} P_{ij} D_j z_j(t) + m\,\pi_i \sum_{j=1}^{n} D_j z_j(t).$$
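One iteration of the recursion in summation form can be sketched directly. This is a minimal illustration under stated assumptions: the function name `step` and the two-environment values of `P`, `pi`, `D`, `m`, and `z` are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def step(z, P, pi, D, m):
    """One generation of z_i(t+1) = (1-m) sum_j P_ij D_j z_j + m pi_i sum_j D_j z_j."""
    n = len(z)
    offspring = sum(D[j] * z[j] for j in range(n))  # all offspring produced this generation
    return [(1 - m) * sum(P[i][j] * D[j] * z[j] for j in range(n))
            + m * pi[i] * offspring
            for i in range(n)]


# Hypothetical two-environment example; the columns of P sum to one and P pi = pi.
P = [[0.9, 0.2],
     [0.1, 0.8]]
pi = [2 / 3, 1 / 3]
D = [2.0, 0.5]
z0 = [10.0, 20.0]
z1 = step(z0, P, pi, D, m=0.3)
print(z1)  # approximately [20.0, 10.0]
```

Because the columns of `P` and the entries of `pi` each sum to one, the total after the step equals the total number of offspring, $\sum_j D_j z_j(t)$; dispersal and environmental change only redistribute individuals.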



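McNamara and Dall (2011) obtain analytical results for the case of $n = 2$ types of environment:

$$\begin{pmatrix} z_1(t+1) \\ z_2(t+1) \end{pmatrix} = \mathbf{M}\mathbf{D} \begin{pmatrix} z_1(t) \\ z_2(t) \end{pmatrix}, \quad \text{where } \mathbf{D} = \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix}$$

and

$$\mathbf{M} = (1-m)\begin{pmatrix} 1 - P_{21} & P_{12} \\ P_{21} & 1 - P_{12} \end{pmatrix} + m \begin{pmatrix} \pi_1 & \pi_1 \\ 1 - \pi_1 & 1 - \pi_1 \end{pmatrix}. \tag{8}$$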



The model is notable for how it represents environmental randomness. The common way to model randomly changing environments would be to let $\mathbf{z}(t)$ represent the population size in each patch, $\mathbf{M}$ represent the dispersal between patches, and let the matrix of environment-specific growth rates, $\mathbf{D}$, be a random or time-dependent variable on each patch (e.g. Karlin 1982, pp. 90-92, 103-104, 140-145), yielding a system

$$\mathbf{z}(t) = \mathbf{M}\mathbf{D}(t)\,\mathbf{M}\mathbf{D}(t-1) \cdots \mathbf{M}\mathbf{D}(2)\,\mathbf{M}\mathbf{D}(1)\,\mathbf{z}(0). \tag{9}$$

The analysis of such models can be challenging, requiring a resort to approximations and numerical analysis (see Gillespie 1981b; Tuljapurkar 1990; Wiener and Tuljapurkar 1994). Progress is being made in this area, however, for example the analysis of a two-cycle model for the evolution of dispersal by Schreiber and Li (2011).



When the random process of changing environments is independent among all the patches, as the number of patches becomes large, the system becomes deterministic in the same way that the Wright-Fisher model becomes deterministic for large populations. This allows one to stop keeping track of each patch, and just keep track of the number of individuals in each environment type, which is what McNamara and Dall do in (6). This tremendously simplifies the analysis.

5.1 Clues to the Generalization of the Results

The original motivation for this paper was to generalize the results of McNamara and Dall (2011) to an arbitrary number of environments, and to gain insight into why their model produces departures from the reduction phenomenon. Their results reveal four clues needed to solve this generalization:



1. The harmonic mean: McNamara and Dall (2011) find that departures from the reduction phenomenon are determined by the critical condition $\tau_1^{-1} + \tau_2^{-1} < 1$, where $\tau_i$ is the expected duration of environment $i$. This expression is part of the harmonic mean, $2/(\tau_1^{-1} + \tau_2^{-1})$. Could the harmonic mean of $\tau_i$ figure into a generalization of their results?

2. The limiting distribution: The two matrices in (7), $\mathbf{P}$ and $\boldsymbol{\pi}\mathbf{e}^\top$, are not arbitrary, but have the relation $\boldsymbol{\pi}\mathbf{e}^\top = \lim_{t\to\infty}\mathbf{P}^t$. This means, notably, that they commute: $\mathbf{P}(\boldsymbol{\pi}\mathbf{e}^\top) = (\boldsymbol{\pi}\mathbf{e}^\top)\mathbf{P} = \boldsymbol{\pi}\mathbf{e}^\top$, and thus satisfy one key condition of Karlin's Theorem 5.1.

3. The second eigenvalue: The terms $\tau_1^{-1}$ and $\tau_2^{-1}$ derive from the probabilities in $\mathbf{P}$: $\tau_1^{-1} = 1 - P_{11}$ and $\tau_2^{-1} = 1 - P_{22}$. The condition $\tau_1^{-1} + \tau_2^{-1} < 1$ translates to $1 < P_{11} + P_{22}$. Is it a coincidence that $P_{11} + P_{22}$ appears in the second eigenvalue of $\mathbf{P}$, $\lambda_2(\mathbf{P}) = P_{11} + P_{22} - 1$? The condition becomes $\lambda_2(\mathbf{P}) > 0$, and it fails, with $\lambda_2(\mathbf{P}) < 0$, precisely when $\mathbf{P}$ no longer meets the condition of Karlin's Theorem 5.1 that it be symmetrizable to a positive definite matrix. By extrapolation, if all the eigenvalues of $\mathbf{P}$ besides 1 are negative, could this be a general condition for departures from reduction?

4. Symmetrizability: Since clues 2 and 3 show the relationship between the results of McNamara and Dall and Karlin's Theorem 5.2, and we note that irreducible $2 \times 2$ matrices are always symmetrizable, might we want to retain symmetrizability in $\mathbf{P}$ as we try to generalize the results to $n \times n$ matrices?
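The arithmetic behind the third clue can be written out explicitly. The following is a short worked derivation, assuming only that $\mathbf{P}$ is a $2 \times 2$ column stochastic matrix with diagonal entries $P_{11}$ and $P_{22}$, and that $\tau_i^{-1} = 1 - P_{ii}$ as stated in the clue:

```latex
\begin{align*}
\operatorname{tr}\mathbf{P} &= P_{11} + P_{22}
   = \lambda_1(\mathbf{P}) + \lambda_2(\mathbf{P}) = 1 + \lambda_2(\mathbf{P}), \\
\lambda_2(\mathbf{P}) &= P_{11} + P_{22} - 1
   = 1 - \left(\tau_1^{-1} + \tau_2^{-1}\right), \\
\tau_1^{-1} + \tau_2^{-1} < 1 &\iff \lambda_2(\mathbf{P}) > 0, \qquad
\tau_1^{-1} + \tau_2^{-1} > 1 \iff \lambda_2(\mathbf{P}) < 0 .
\end{align*}
```

In terms of the first clue, the harmonic mean $2/(\tau_1^{-1} + \tau_2^{-1})$ exceeds 2 exactly when $\lambda_2(\mathbf{P}) > 0$, and falls below 2 exactly when $\lambda_2(\mathbf{P}) < 0$.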

By following the last clue and constraining $\mathbf{P}$ to be symmetrizable as in (1), we shall find it tractable to generalize the results of McNamara and Dall (2011), and we shall see that the conjectures prompted by the first and third clues are true.

Symmetrizable stochastic matrices are equivalent to the transition matrices of ergodic reversible Markov chains (Altenberg 2011, Lemma 2). A Markov chain is reversible when the probability of cycles in one direction equals the probability of cycles in the opposite direction (Ross 1983, Theorem 4.7.1, p. 127). In nature, directional cycles of environmental change may be more the rule than the exception, however. Whether cyclical environments would produce different results remains an open question.



We can step beyond the McNamara and Dall model and obtain a more general theorem for departures from reduction for the form $\mathbf{M}(m) = \mathbf{P}[(1-m)\mathbf{I} + m\mathbf{Q}]$, where $\mathbf{P}$ and $\mathbf{Q}$ satisfy (1). This is provided in Theorem 16 in Results. The theorem in Altenberg (2009a, 2011) that generalizes the reduction principle to the evolution of mutation rates among multiple loci turns out to be a special case of Theorem 16. This again illustrates the fact that genetic, spatial, cultural, and other transmissible information all belong to a single mathematical framework, and that results from one domain can often translate easily into results in other domains.

6 Results 



McNamara and Dall (2011) describe their concept of a "multiplier effect" without ever giving it a precise mathematical definition. But it is clear from their usage in McNamara and Dall (2011, online Appendix A, Theorem A) that what they are thinking about can be summarized as the covariance between 1) the growth rates in each environment, and 2) the excess abundance of the population in that environment over what it would be without differential growth rates. This is defined explicitly below as the fitness-abundance covariance.

When organisms are semelparous, and generations are discrete and non-overlapping, there are two phases in the life cycle at which one can census the population: before and after dispersal, or equivalently, after and before reproduction. Thus, the fitness-abundance covariance must be defined for both census phases.

The fitness-abundance covariance is an object of interest in its own right. Section 6.1 ventures beyond the specifics of the McNamara and Dall model and explores various properties of the fitness-abundance covariance for the completely general case of $\mathbf{z}(t+1) = \mathbf{M}\mathbf{D}\mathbf{z}(t)$, where $\mathbf{M}$ is a stochastic matrix representing any process of change between states, and $\mathbf{D}$ represents the state-specific growth rates. The generality of results in Section 6.1 not only includes the McNamara and Dall model as a special case, but goes beyond models of dispersal since $\mathbf{M}$ can just as well represent a mutation matrix between genotypes whose fitnesses are $D_i$. The results can also apply to a rare genotype in a sexual population where $\mathbf{M}\mathbf{D}$ represents the linear stability matrix on its growth.

Section 6.3 returns to the specific model of McNamara and Dall with the chief goal of generalizing the results to any number of environmental states. Here is where we pursue the clues described in the previous section.

For clarity, terminology and conventions are provided in Table 1.

6.1 The Fitness-Abundance Covariance

A precise definition needs to be given for the degree to which "individuals tend to be in locations to which they are well suited." While it may sound reasonable that an organism's "mere existence informs an organism that it is liable to be in favourable circumstances" (McNamara and Dall 2011, p. 237), this is not generally true. The following is an example where an organism is more likely to find itself in a sink habitat than a source habitat. The situation is where there is a small source patch within a large habitat of sink patches. We get a simple result if we assume the extremes: that sink habitats are lethal, and dispersing organisms recruit with fixed probabilities $\pi_i$ to each patch, $i$, and the dispersal rate is $m$. Then $v_1 = 1 - m(1 - \pi_1)$ is the stationary proportion of the population in the source patch after dispersal. The stationary portion of the population in the source patch, $v_1$, can be made as small as one wishes by large dispersal rate $m$, and small $\pi_1$.

Clearly, before dispersal, all organisms in this example are in the source patch. So one must be clear about when in the life cycle one is speaking. An organism 'deciding' on whether to disperse or not is obviously at the pre-dispersal phase. But the post-dispersal phase is the phase that McNamara and Dall use to measure the 'multiplier effect'.

Examination of the results of McNamara and Dall also reveals that when they speak of an organism being "liable to be in favourable circumstances," what they actually mean is that an organism is more likely to be in a favorable habitat than it would be if there were no growth advantage there, not that the organism is actually liable to be there. This is the concept that I make precise as the fitness-abundance covariance. Even in this relative value of abundance, however, we will see that the fitness-abundance covariance is not always positive.

The fitness-abundance covariance relates three different sets of values: the environment-specific growth rates $D_i$, the stationary distribution in the presence of differential growth rates, referred to as $v_i$, and the stationary distribution in the absence of differential growth rates, referred to as $\pi_i$.

The stationary distribution for recursion $\mathbf{z}(t+1) = \mathbf{M}\mathbf{D}\mathbf{z}(t)$ satisfies

$$\rho\,\mathbf{v} = \mathbf{M}\mathbf{D}\mathbf{v},$$

where $\mathbf{v}$ is the eigenvector of $\mathbf{M}\mathbf{D}$ associated with the largest eigenvalue of $\mathbf{M}\mathbf{D}$, $\rho$. This is called the right Perron vector. Throughout, $\mathbf{v}(\mathbf{A})$ will represent the right Perron vector of a matrix $\mathbf{A}$ (see Table 1).

The magnitude of $\rho$ determines whether the population grows ($\rho > 1$) or declines ($\rho < 1$) or is stationary ($\rho = 1$), and ecological models typically impose some kind of negative density dependence so that as population size $\mathbf{z}$ gets large enough, $\rho$ decreases with $\mathbf{z}$, and a stationary state of $\rho = 1$ can be attained. The problems addressed here do not concern the absolute value of $\rho$, but only the relative changes to $\rho$ and $\mathbf{v}$ under changes in $\mathbf{M}$ and $\mathbf{D}$. For a general treatment of negative density dependence, Kirkland et al. (2006) provide a thorough analysis.

Table 1: Definitions and Symbols

$\mathbf{A}, \mathbf{M}, \mathbf{D}, \mathbf{P}, \mathbf{S}$ or other boldface capital letters represent $n \times n$ matrices, and $\mathbf{v}, \mathbf{x}, \mathbf{y}, \mathbf{e}$, or other boldface lower case characters represent $n$-vectors; the identity matrix is $\mathbf{I}$; a scalar matrix is $c\mathbf{I}$ for $c \in \mathbb{R}$;

$A_{ij} = [\mathbf{A}]_{ij}$ represents the elements of $\mathbf{A}$, $i, j = 1, \ldots, n$, and $x_i$ represents the elements of $\mathbf{x}$;

$D_i = [\mathbf{D}]_{ii}$ represents the diagonal elements of diagonal matrix $\mathbf{D}$;

a positive diagonal matrix has $D_i > 0$, $i = 1, \ldots, n$;

$[\mathbf{A}]^i$ represents the $i$th row of matrix $\mathbf{A}$, and $[\mathbf{A}]_j$ represents the $j$th column;

$\mathbf{e}$ represents the unit vector, where all elements are 1;

$\mathbf{e}_j$ represents the $j$th basis vector, which has 1 at position $j$ and 0 elsewhere;

$\operatorname{diag}[\mathbf{x}] = \mathbf{D}_x$ is a diagonal matrix of the vector $\mathbf{x}$;

$\mathbf{A}^\top, \mathbf{z}^\top, \mathbf{e}^\top$, etc., represent the transpose;

$\lambda_i(\mathbf{A}) = \lambda_{Ai}$, $i = 1, \ldots, n$, represent the eigenvalues of $\mathbf{A}$;

symmetrizable to $\mathbf{S}$ means that an $n \times n$ matrix can be represented as a product $\mathbf{A} = \mathbf{L}\mathbf{S}\mathbf{R}$, where $\mathbf{S}$ is a symmetric real matrix, and $\mathbf{L}$ and $\mathbf{R}$ are positive diagonal matrices;

stochastic means an $n \times n$ matrix with nonnegative elements and whose columns (by convention here) sum to one (column stochastic);

positive definite means a matrix that is symmetric and has only positive eigenvalues;

irreducible means an $n \times n$ nonnegative matrix where for every $i, j$ there is some $t$ such that $[\mathbf{A}^t]_{ij} > 0$;

$\rho(\mathbf{A}) := \max_i |\lambda_i(\mathbf{A})|$ represents the spectral radius, the largest modulus of any eigenvalue of $\mathbf{A}$;

$\lambda_1(\mathbf{A})$ by convention will refer to the Perron root of a nonnegative irreducible matrix $\mathbf{A}$, which is the positive eigenvalue guaranteed by Perron-Frobenius theory (Seneta 2006, Theorems 1.1, 1.5) to exist, to be the spectral radius, and to be as large as the modulus (i.e. magnitude) of any other eigenvalue. So $\lambda_1(\mathbf{A}) = \rho(\mathbf{A}) \geq |\lambda_i(\mathbf{A})|$ for $i = 2, \ldots, n$;

$\mathbf{v}(\mathbf{A})$ and $\mathbf{u}(\mathbf{A})^\top$ represent the right and left Perron vectors of nonnegative irreducible $\mathbf{A}$, the eigenvectors associated with the Perron root, guaranteed by Perron-Frobenius theory to be strictly positive. So $\mathbf{A}\mathbf{v}(\mathbf{A}) = \rho(\mathbf{A})\mathbf{v}(\mathbf{A})$, and $\mathbf{u}(\mathbf{A})^\top\mathbf{A} = \rho(\mathbf{A})\mathbf{u}(\mathbf{A})^\top$. By convention $\mathbf{e}^\top\mathbf{v}(\mathbf{A}) = 1$ and $\mathbf{u}(\mathbf{A})^\top\mathbf{v}(\mathbf{A}) = 1$;

$\mathbf{v} = \mathbf{v}(\mathbf{A})$, $\mathbf{u} = \mathbf{u}(\mathbf{A})$, and $\rho = \rho(\mathbf{A})$, throughout, where $\mathbf{A}$ is obvious from context;

$\boldsymbol{\pi} = \mathbf{v}(\mathbf{P})$ traditionally represents the stationary distribution of irreducible (column) stochastic matrix $\mathbf{P}$;

the harmonic mean of a set of numbers $\{\tau_i\}$ is

$$E_H(\tau_i) := \frac{1}{\dfrac{1}{n}\displaystyle\sum_{i=1}^{n} \dfrac{1}{\tau_i}}.$$



The stationary distribution in the presence of differential growth rates depends on the phase in the life cycle at which the census is taken. The life cycle consists of alternation between differential growth and dispersal, $\ldots \mathbf{D}\mathbf{M}\mathbf{D}\mathbf{M}\mathbf{D}\mathbf{M} \ldots$. When censused just after dispersal, the stationary distribution is $\mathbf{v}(\mathbf{M}\mathbf{D})$. Censused just before dispersal it is $\mathbf{v}(\mathbf{D}\mathbf{M})$.

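We see that $\mathbf{v}(\mathbf{M}\mathbf{D})$ and $\mathbf{v}(\mathbf{D}\mathbf{M})$ have a simple relationship from the cyclical structure. $\mathbf{D}\mathbf{v}(\mathbf{M}\mathbf{D})$ is the Perron vector of $\mathbf{D}\mathbf{M}$ up to scaling, since

$$\mathbf{D}\left[\mathbf{M}\mathbf{D}\,\mathbf{v}(\mathbf{M}\mathbf{D})\right] = \rho(\mathbf{M}\mathbf{D})\,\mathbf{D}\,\mathbf{v}(\mathbf{M}\mathbf{D}).$$

When scaled to satisfy $\mathbf{e}^\top\mathbf{v}(\mathbf{D}\mathbf{M}) = 1$, one gets the relationship:

$$\mathbf{v}(\mathbf{D}\mathbf{M}) = \frac{1}{\rho(\mathbf{M}\mathbf{D})}\,\mathbf{D}\,\mathbf{v}(\mathbf{M}\mathbf{D}). \tag{10}$$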
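Relation (10) between the two census phases is easy to confirm numerically. The sketch below is illustrative only: the matrix `M`, the growth rates `D`, and the helper `perron` (a shifted power iteration) are assumptions introduced here, not part of the original analysis.

```python
def perron(A, iters=5000):
    """Perron root and sum-normalized right Perron vector of a nonnegative
    irreducible matrix A (a list of rows), by shifted power iteration."""
    n = len(A)
    s = max(sum(row) for row in A)  # iterate on A + s*I so periodic A cannot oscillate
    x = [1.0 / n] * n
    for _ in range(iters):
        y = [s * x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [yi / total for yi in y]
    rho = sum(A[i][j] * x[j] for i in range(n) for j in range(n))  # e^T A x, with e^T x = 1
    return rho, x


# Hypothetical column-stochastic M (columns sum to one) and growth rates D.
M = [[0.7, 0.1, 0.2],
     [0.2, 0.6, 0.3],
     [0.1, 0.3, 0.5]]
D = [0.5, 1.0, 2.0]
n = len(D)

MD = [[M[i][j] * D[j] for j in range(n)] for i in range(n)]  # growth, then dispersal
DM = [[D[i] * M[i][j] for j in range(n)] for i in range(n)]  # dispersal, then growth

rho_MD, v_MD = perron(MD)   # census just after dispersal
rho_DM, v_DM = perron(DM)   # census just before dispersal
v_from_10 = [D[i] * v_MD[i] / rho_MD for i in range(n)]      # right-hand side of (10)

print(rho_MD, rho_DM)
print(v_DM)
print(v_from_10)
```

The two products share the same spectral radius, and the directly computed pre-dispersal vector agrees with the one obtained from the post-dispersal vector through (10).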



It should be noted that in continuous-time models such as quasispecies (Eigen and Schuster 1977), selection and transformation happen simultaneously so there are no separate life cycle phases, hence no distinction between pre- and post-dispersal stationary states.

For semelparous organisms with discrete, non-overlapping generations, the fitness-abundance covariance is now defined for both phases of the life cycle.

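Definition (Fitness-Abundance Covariance).

The fitness-abundance covariance is defined as the unweighted covariance between the environment-specific growth rates and the excess of the stationary distribution above the distribution that the population would attain in the absence of differential growth rates:

1. Post-dispersal:

$$\begin{aligned}
\mathcal{FA}(\mathbf{M}\mathbf{D}) &:= \operatorname{Cov}\big(D_i,\; v_i(\mathbf{M}\mathbf{D}) - v_i(\mathbf{M})\big) \\
&= \frac{1}{n}\sum_{i=1}^{n} D_i \big(v_i(\mathbf{M}\mathbf{D}) - v_i(\mathbf{M})\big) - \frac{1}{n}\sum_{i=1}^{n} D_i \cdot \frac{1}{n}\sum_{i=1}^{n} \big(v_i(\mathbf{M}\mathbf{D}) - v_i(\mathbf{M})\big).
\end{aligned}$$

2. Pre-dispersal:

$$\begin{aligned}
\mathcal{FA}(\mathbf{D}\mathbf{M}) &:= \operatorname{Cov}\big(D_i,\; v_i(\mathbf{D}\mathbf{M}) - v_i(\mathbf{M})\big) \\
&= \frac{1}{n}\sum_{i=1}^{n} D_i \big(v_i(\mathbf{D}\mathbf{M}) - v_i(\mathbf{M})\big) - \frac{1}{n}\sum_{i=1}^{n} D_i \cdot \frac{1}{n}\sum_{i=1}^{n} \big(v_i(\mathbf{D}\mathbf{M}) - v_i(\mathbf{M})\big).
\end{aligned}$$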

Several elementary results are described. The first 
shows that the relationship between the pre- and post- 
dispersal fitness-abundance covariances is Fisher's Funda- 
mental Theorem of Natural Selection in a slightly new 
context. 

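Theorem 3 (Fitness-Abundance Covariance and Census Phases).

Let $\mathbf{M}$ be an irreducible column stochastic matrix and $\mathbf{D}$ a positive diagonal matrix.
Then

$$\mathcal{FA}(\mathbf{D}\mathbf{M}) = \mathcal{FA}(\mathbf{M}\mathbf{D}) + \frac{1}{n\,\rho(\mathbf{M}\mathbf{D})}\operatorname{Var}_v(D_i),$$

where $\operatorname{Var}_v(D_i)$ is the $\mathbf{v}(\mathbf{M}\mathbf{D})$-weighted variance of $D_i$,

$$\operatorname{Var}_v(D_i) := \sum_{i=1}^{n} v_i(\mathbf{M}\mathbf{D})\, D_i^2 - \left(\sum_{i=1}^{n} v_i(\mathbf{M}\mathbf{D})\, D_i\right)^2.$$

Proof. Here, $\boldsymbol{\pi} = \mathbf{v}(\mathbf{M})$, $\mathbf{v} = \mathbf{v}(\mathbf{M}\mathbf{D})$, and $\rho = \rho(\mathbf{M}\mathbf{D})$. We first note that

$$\sum_{i=1}^{n} D_i v_i = \mathbf{e}^\top\mathbf{D}\mathbf{v} = \mathbf{e}^\top\mathbf{M}\mathbf{D}\mathbf{v} = \rho\,\mathbf{e}^\top\mathbf{v} = \rho.$$

Since $\sum_j (v_j - \pi_j) = 0$,

$$\begin{aligned}
\mathcal{FA}(\mathbf{M}\mathbf{D}) &= \operatorname{Cov}(D_i, v_i - \pi_i) \\
&= \frac{1}{n}\sum_{i=1}^{n} D_i (v_i - \pi_i) - \frac{1}{n}\sum_{i=1}^{n} D_i \cdot \frac{1}{n}\sum_{j=1}^{n} (v_j - \pi_j)
= \frac{1}{n}\left(\rho - \sum_{i=1}^{n} D_i \pi_i\right).
\end{aligned} \tag{11}$$

Substitution with (10) and (11) gives

$$\begin{aligned}
\mathcal{FA}(\mathbf{D}\mathbf{M}) &= \operatorname{Cov}\big(D_i, v_i(\mathbf{D}\mathbf{M}) - \pi_i\big) \\
&= \frac{1}{n}\left(\frac{1}{\rho}\sum_{i=1}^{n} D_i^2 v_i - \sum_{i=1}^{n} D_i \pi_i\right) && (12) \\
&= \frac{1}{n\rho}\left(\sum_{i=1}^{n} D_i^2 v_i - \rho^2\right) + \frac{1}{n}\left(\rho - \sum_{i=1}^{n} D_i \pi_i\right) && (13) \\
&= \frac{1}{n\rho}\operatorname{Var}_v(D_i) + \mathcal{FA}(\mathbf{M}\mathbf{D}). \qquad ■
\end{aligned}$$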
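The identity of Theorem 3 can be checked on a small example. This is a minimal numerical sketch: the matrix `M`, the growth rates `D`, and the helpers `perron` and `cov_excess` are hypothetical names and values introduced for illustration.

```python
def perron(A, iters=5000):
    """Perron root and sum-normalized right Perron vector of a nonnegative
    irreducible matrix A (a list of rows), by shifted power iteration."""
    n = len(A)
    s = max(sum(row) for row in A)  # iterate on A + s*I so periodic A cannot oscillate
    x = [1.0 / n] * n
    for _ in range(iters):
        y = [s * x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [yi / total for yi in y]
    rho = sum(A[i][j] * x[j] for i in range(n) for j in range(n))  # e^T A x, with e^T x = 1
    return rho, x


def cov_excess(D, v, pi):
    """Unweighted covariance between D_i and the excess abundance v_i - pi_i."""
    n = len(D)
    excess = [v[i] - pi[i] for i in range(n)]
    mean_product = sum(D[i] * excess[i] for i in range(n)) / n
    return mean_product - (sum(D) / n) * (sum(excess) / n)


# Hypothetical column-stochastic M and growth rates D.
M = [[0.7, 0.1, 0.2],
     [0.2, 0.6, 0.3],
     [0.1, 0.3, 0.5]]
D = [0.5, 1.0, 2.0]
n = len(D)
MD = [[M[i][j] * D[j] for j in range(n)] for i in range(n)]
DM = [[D[i] * M[i][j] for j in range(n)] for i in range(n)]

pi = perron(M)[1]
rho, v = perron(MD)
v_pre = perron(DM)[1]

fa_post = cov_excess(D, v, pi)       # post-dispersal covariance
fa_pre = cov_excess(D, v_pre, pi)    # pre-dispersal covariance
var_v = sum(v[i] * D[i] ** 2 for i in range(n)) - sum(v[i] * D[i] for i in range(n)) ** 2

print(fa_pre, fa_post + var_v / (n * rho))
```

The two printed values agree to rounding error, and the pre-dispersal covariance exceeds the post-dispersal one by the nonnegative variance term.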



Corollary 4 (Derivatives of Fitness-Abundance Covariances and $\rho$).

Let $\mathbf{M}(m)$ be a family of irreducible stochastic matrices, differentiable in $m$, and assume $\mathbf{v}(\mathbf{M}(m)) = \boldsymbol{\pi}$ for all $m \in (0, 1]$. Let $\mathbf{D}$ be a positive diagonal matrix. Set $\rho = \rho(\mathbf{M}(m)\mathbf{D})$.

Then

$$\frac{d}{dm}\mathcal{FA}(\mathbf{M}(m)\mathbf{D}) = \frac{1}{n}\frac{d\rho}{dm}, \tag{14}$$

and

$$\frac{d}{dm}\mathcal{FA}(\mathbf{D}\mathbf{M}(m)) = \frac{1}{n}\left(\frac{d\rho}{dm} + \frac{1}{\rho}\frac{d}{dm}\operatorname{Var}_v(D_i) - \frac{1}{\rho^2}\frac{d\rho}{dm}\operatorname{Var}_v(D_i)\right). \tag{15}$$

Proof. Differentiation of (11) and (13) directly gives (14) and (15). ■

By plain intuition we would expect the fitness-abundance covariance to be positive. But McNamara and Dall found circumstances in which the fitness-abundance covariance is negative just after dispersal. The question then remains, what about just before dispersal, when individuals are still in the environment whose growth rate they just replicated under? Here is where intuition suggests the fitness-abundance covariance should be positive. This is proven to be the case when $\mathbf{M}$ is the transition matrix of a reversible Markov chain with positive eigenvalues. But a counterexample is provided when $\mathbf{M}$ represents a periodic chain that cycles through the states, which has complex eigenvalues.

Theorem 5 (Positivity of the Pre-Dispersal Fitness-Abundance Covariance).

Let $\mathbf{M}$ be the transition matrix of an ergodic reversible Markov chain, with only nonnegative eigenvalues. Let $\mathbf{D} \neq c\mathbf{I}$ be a positive diagonal matrix.

Then

$$\mathcal{FA}(\mathbf{D}\mathbf{M}) := \operatorname{Cov}\big(D_i, v_i(\mathbf{D}\mathbf{M}) - v_i(\mathbf{M})\big) > 0.$$

Proof. Here, $\boldsymbol{\pi} = \mathbf{v}(\mathbf{M})$, $\mathbf{v} = \mathbf{v}(\mathbf{M}\mathbf{D})$, and $\rho = \rho(\mathbf{M}\mathbf{D})$. From (12),

$$\mathcal{FA}(\mathbf{D}\mathbf{M}) > 0 \iff \sum_{i=1}^{n} D_i^2 v_i > \rho \sum_{i=1}^{n} D_i \pi_i.$$

Since $\mathbf{D} \neq c\mathbf{I}$, and $\mathbf{v} > 0$,

$$\operatorname{Var}_v(D_i) = \sum_{i=1}^{n} D_i^2 v_i - \left(\sum_{i=1}^{n} D_i v_i\right)^2 > 0. \tag{16}$$

The condition that $\mathbf{M}$ be the transition matrix of an ergodic reversible Markov chain is equivalent to it being diagonally similar to a symmetric matrix (Keilson 1979, Proposition 1.3B; Altenberg 2011, Lemma 2). Since $\mathbf{M}$ has all nonnegative eigenvalues, that matrix is positive semidefinite. This allows application of the inequality in Friedland and Karlin (1975, Theorem 4.1): $\rho(\mathbf{D}\mathbf{M}) \geq \sum_{i=1}^{n} D_i \pi_i$. In (16), using $\sum_i D_i v_i = \rho$, this gives

$$\sum_{i=1}^{n} D_i^2 v_i > \rho^2 \geq \rho \sum_{i=1}^{n} D_i \pi_i. \qquad ■$$

For a counterexample to the positivity of the pre-dispersal fitness-abundance covariance, we try a transition matrix $\mathbf{M}$ that is as far from Theorem 5 as possible, so the states are periodic and the eigenvalues other than 1 are complex roots of unity. This represents the situation of pelagic organisms along a gyre (e.g. Cowen et al. 2006).

Theorem 6. Let $\mathbf{M}$ be an $n$-cyclic matrix,

$$\mathbf{M} = \begin{pmatrix}
0 & 0 & \cdots & 0 & 1 \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{pmatrix}.$$

Then for $\mathbf{D} \neq c\mathbf{I}$,

$$\mathcal{FA}(\mathbf{M}\mathbf{D}) = \frac{1}{n}\left[\left(\prod_{i=1}^{n} D_i\right)^{1/n} - \frac{1}{n}\sum_{i=1}^{n} D_i\right] < 0. \tag{17}$$



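Proof. For a matrix cyclic in this direction, given any $\mathbf{z} > 0$, $[\mathbf{M}\mathbf{D}\mathbf{z}]_{(i \bmod n)+1} = D_i z_i$. Thus $(\mathbf{M}\mathbf{D})^n\mathbf{z} = \left(\prod_{i=1}^{n} D_i\right)\mathbf{z}$. So $(\mathbf{M}\mathbf{D})^n\mathbf{v}(\mathbf{M}\mathbf{D}) = \left(\prod_{i=1}^{n} D_i\right)\mathbf{v}(\mathbf{M}\mathbf{D}) = \rho(\mathbf{M}\mathbf{D})^n\,\mathbf{v}(\mathbf{M}\mathbf{D})$. Hence $\rho(\mathbf{M}\mathbf{D}) = \prod_{i=1}^{n} D_i^{1/n}$. Substitution in (11) gives (17). $\mathcal{FA}(\mathbf{M}\mathbf{D})$ is negative because it is $1/n$ times the difference between the geometric and arithmetic means of $D_i$, which is always negative if not all $D_i$ are equal (Steele 2004, pp. 20-26). ■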
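The steps of this proof can be traced numerically. The sketch below is an illustration under stated assumptions: the function `apply_MD`, the choice $n = 4$, and the values of `D` and `z` are hypothetical, and indices are 0-based, so $[\mathbf{M}\mathbf{D}\mathbf{z}]_{(i \bmod n)+1} = D_i z_i$ becomes `out[(i + 1) % n] = D[i] * z[i]`.

```python
def apply_MD(z, D):
    """Apply growth D and then one step of the cycle 1 -> 2 -> ... -> n -> 1."""
    n = len(z)
    out = [0.0] * n
    for i in range(n):
        out[(i + 1) % n] = D[i] * z[i]
    return out


D = [1.0, 2.0, 3.0, 4.0]          # hypothetical growth rates, not all equal
z = [0.3, 1.0, 2.5, 0.7]          # arbitrary positive starting vector
n = len(D)

w = z
for _ in range(n):                # n applications return every entry to its start
    w = apply_MD(w, D)

product = 1.0
for d in D:
    product *= d

geometric_mean = product ** (1.0 / n)      # equals rho(MD) by the argument above
arithmetic_mean = sum(D) / n               # equals sum_i D_i pi_i with pi_i = 1/n
fa_post = (geometric_mean - arithmetic_mean) / n   # right-hand side of (17)

print(w, [product * zi for zi in z])
print(geometric_mean, arithmetic_mean, fa_post)
```

After $n$ steps each entry has been multiplied by every $D_i$ exactly once, so the vector is scaled by the product of the growth rates, and the resulting covariance is negative because the geometric mean lies below the arithmetic mean.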



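Theorem 7. When the states are transformed in a cycle, it is possible for the pre-dispersal fitness-abundance covariance to be negative.

Proof. An example is constructed. Let $\mathbf{M}$ represent the period-3 cycle of states $1 \to 2 \to 3 \to 1 \ldots$,

$$\mathbf{M} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},$$

and let $\mathbf{D} = \operatorname{diag}[D_1, D_2, D_3]$.

The spectral radius is $\rho = (D_1 D_2 D_3)^{1/3}$. By symmetry, $\pi_i = 1/3$, $i = 1, 2, 3$. Symbolic computation with Mathematica shows that

$$\mathbf{v}(\mathbf{D}\mathbf{M}) = \begin{pmatrix}
\left[1 + \dfrac{D_2^{2/3}}{D_1^{1/3} D_3^{1/3}} + \dfrac{D_2^{1/3} D_3^{1/3}}{D_1^{2/3}}\right]^{-1} \\[2ex]
\left[1 + \dfrac{D_1^{1/3} D_3^{1/3}}{D_2^{2/3}} + \dfrac{D_3^{2/3}}{D_1^{1/3} D_2^{1/3}}\right]^{-1} \\[2ex]
\left[1 + \dfrac{D_1^{1/3} D_2^{1/3}}{D_3^{2/3}} + \dfrac{D_1^{2/3}}{D_2^{1/3} D_3^{1/3}}\right]^{-1}
\end{pmatrix}.$$

A numerical survey shows that $\mathcal{FA}(\mathbf{D}\mathbf{M})$ is positive over most values of $(D_1, D_2, D_3)$ except for a very narrow range of $\mathbf{D}$ near the boundary where $\mathcal{FA}(\mathbf{D}\mathbf{M})$ becomes negative. One such value is $(D_1, D_2, D_3) = (1/40000, 1, 1/8)$, which yields $\rho = 0.0146$, $\mathbf{v}(\mathbf{M}\mathbf{D})^\top = (0.894, 0.0015, 0.105)$, $\mathbf{v}(\mathbf{D}\mathbf{M})^\top = (0.0015, 0.105, 0.894)$, and $\mathcal{FA}(\mathbf{D}\mathbf{M}) = -0.159$. ■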
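The quoted numbers can be reproduced without symbolic software. The following is a minimal sketch that assumes the period-3 cycle $1 \to 2 \to 3 \to 1$ and uses relation (10); the variable names are illustrative. In this sketch the value near $-0.159$ is obtained from the sum $\sum_i D_i (v_i(\mathbf{D}\mathbf{M}) - \pi_i)$, and dividing by $n = 3$ as in the definition gives a value near $-0.053$; the sign is negative either way.

```python
D = [1 / 40000, 1.0, 1 / 8]
n = len(D)
rho = (D[0] * D[1] * D[2]) ** (1 / 3)

# For the cycle 1 -> 2 -> 3 -> 1, MD v = rho v means v[i + 1] = D[i] * v[i] / rho.
v = [1.0]
for i in range(n - 1):
    v.append(D[i] * v[i] / rho)
total = sum(v)
v_MD = [x / total for x in v]                      # census just after dispersal

v_DM = [D[i] * v_MD[i] / rho for i in range(n)]    # census just before dispersal, by (10)
pi = [1 / n] * n

cov_sum = sum(D[i] * (v_DM[i] - pi[i]) for i in range(n))
fa_pre = cov_sum / n    # the product-of-means term vanishes because sum(v_DM - pi) = 0

print(rho)
print(v_MD)
print(v_DM)
print(cov_sum, fa_pre)
```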

The point of this odd counterexample is not that it rep- 
resents something we might find in nature, but rather to 
say that we cannot entirely trust our intuition about the 
fitness-abundance covariance, and that something more 
subtle is going on mathematically than we might suppose. 

6.2 Individual Stationary State Frequencies 

Let us now examine the relationships between individual values of $v_i$, $D_i$ and $\pi_i$.

McNamara and Dall show for the case of $n = 2$ that the post-dispersal fitness-abundance covariance is positive or negative depending on the durations of the environments, restated in their terms here:



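Theorem 8 (McNamara and Dall (2011, online Appendix A, Theorem A)). Let $n = 2$ in (7).

1. If $\tau_1^{-1} + \tau_2^{-1} < 1$ then $\rho(\mathbf{M}\mathbf{D}) > \sum_{i=1}^{2} D_i \pi_i$ and

(a) $D_1 < D_2 \implies v_2(\mathbf{M}\mathbf{D}) > v_2(\mathbf{M}) = \pi_2$,

(b) $D_1 > D_2 \implies v_2(\mathbf{M}\mathbf{D}) < v_2(\mathbf{M}) = \pi_2$.

2. If $\tau_1^{-1} + \tau_2^{-1} = 1$ then $v_i(\mathbf{M}\mathbf{D}) = \pi_i$, $i = 1, 2$, and $\rho(\mathbf{M}\mathbf{D}) = \sum_{i=1}^{2} D_i \pi_i$.

3. If $\tau_1^{-1} + \tau_2^{-1} > 1$ then $\rho(\mathbf{M}\mathbf{D}) < \sum_{i=1}^{2} D_i \pi_i$ and

(a) $D_1 < D_2 \implies v_2(\mathbf{M}\mathbf{D}) < v_2(\mathbf{M}) = \pi_2$,

(b) $D_1 > D_2 \implies v_2(\mathbf{M}\mathbf{D}) > v_2(\mathbf{M}) = \pi_2$.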
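The three cases can be illustrated with a symmetric two-environment example. This is a sketch under assumptions introduced here: each environment persists with the same probability `p_stay`, so that $\tau_1^{-1} + \tau_2^{-1} = 2(1 - \texttt{p\_stay})$ and $\boldsymbol{\pi} = (1/2, 1/2)$; the helper `perron` is a shifted power iteration, and the dispersal rate `m = 0.2` is arbitrary.

```python
def perron(A, iters=5000):
    """Perron root and sum-normalized right Perron vector of a nonnegative
    irreducible matrix A (a list of rows), by shifted power iteration."""
    n = len(A)
    s = max(sum(row) for row in A)  # iterate on A + s*I so periodic A cannot oscillate
    x = [1.0 / n] * n
    for _ in range(iters):
        y = [s * x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [yi / total for yi in y]
    rho = sum(A[i][j] * x[j] for i in range(n) for j in range(n))  # e^T A x, with e^T x = 1
    return rho, x


def post_dispersal(p_stay, D, m):
    """rho(MD) and v(MD) for M = (1 - m) P + m pi e^T with a symmetric 2 x 2 P."""
    P = [[p_stay, 1 - p_stay], [1 - p_stay, p_stay]]
    pi = [0.5, 0.5]   # stationary distribution of this symmetric P
    MD = [[((1 - m) * P[i][j] + m * pi[i]) * D[j] for j in range(2)] for i in range(2)]
    return perron(MD)


D = [1.0, 2.0]                 # D_1 < D_2
mean_growth = 0.5 * D[0] + 0.5 * D[1]
for p_stay in (0.9, 0.5, 0.1):
    rho, v = post_dispersal(p_stay, D, m=0.2)
    print(f"tau sum = {2 * (1 - p_stay):.1f}  rho = {rho:.4f}  v_2 = {v[1]:.4f}")
```

With long-lasting environments (`p_stay = 0.9`) the better environment holds more than half of the post-dispersal population; with rapidly alternating environments (`p_stay = 0.1`) it holds less than half, which is the third case of the theorem.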



The third case exhibits the very counterintuitive behav- 
ior that increasing the reproductive output of an environ- 
ment will lower the stationary proportion in that environ- 
ment. We can compare this result to the following general 
theorem on how changes to a matrix affect its Perron vec- 
tor: 



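Theorem 9 (Elsner et al. (1982, Theorem 2.1)).

Let $\mathbf{A}$ be an $n \times n$ nonnegative irreducible matrix. Then for any nonnegative $n$-vector $\mathbf{a} \geq \mathbf{0}$, $\mathbf{a} \neq \mathbf{0}$, and $i \neq j \in \{1, \ldots, n\}$,

$$\frac{v_i(\mathbf{A} + \mathbf{e}_i\mathbf{a}^\top)}{v_i(\mathbf{A})} > \frac{v_j(\mathbf{A} + \mathbf{e}_i\mathbf{a}^\top)}{v_j(\mathbf{A})}. \tag{18}$$

It is more useful for us to put it in the following form.

Corollary 10 (Change in the Perron Vector).

When normalized to frequencies, $\mathbf{e}^\top\mathbf{v} = 1$, then $v_i(\mathbf{A} + \mathbf{e}_i\mathbf{a}^\top) > v_i(\mathbf{A})$.

Proof. The result follows immediately from rearrangement of (18) and summation using $\mathbf{e}^\top\mathbf{v} = 1$:

$$\begin{aligned}
\sum_{j \neq i} v_j(\mathbf{A} + \mathbf{e}_i\mathbf{a}^\top) &< \frac{v_i(\mathbf{A} + \mathbf{e}_i\mathbf{a}^\top)}{v_i(\mathbf{A})} \sum_{j \neq i} v_j(\mathbf{A}) \\
\implies \frac{1 - v_i(\mathbf{A} + \mathbf{e}_i\mathbf{a}^\top)}{v_i(\mathbf{A} + \mathbf{e}_i\mathbf{a}^\top)} &< \frac{1 - v_i(\mathbf{A})}{v_i(\mathbf{A})}. \qquad ■
\end{aligned}$$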



In this case, the behavior of the Perron vector follows our intuition that increasing the $i$th row of $\mathbf{A}$ should increase the stationary proportion $v_i$.

Something must be very different, therefore, between Theorems 8 and 9, since they both deal with changes in the Perron vector when elements of the matrix are changed. Theorem 8 produces counterintuitive results that depend on $\tau_1^{-1} + \tau_2^{-1}$, while Theorem 9 has no conditions on details of the matrix. How can this discrepancy be reconciled?

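We must write $\mathbf{A}$ in terms of $\mathbf{M}$ and $\mathbf{D}$ to compare the two results. Let $[\mathbf{M}]^i$ be the $i$th row of $\mathbf{M}$. We can write

$$\mathbf{A} + \mathbf{e}_i\mathbf{a}^\top = \operatorname{diag}\left[D_1, D_2, \ldots, D_i + \epsilon, \ldots, D_n\right]\mathbf{M},$$

where $\mathbf{A} = \mathbf{D}\mathbf{M}$ and $\mathbf{a}^\top = \epsilon[\mathbf{M}]^i$, $\epsilon > 0$. Corollary 10 shows that increasing the reproductive output of environment $i$ from $D_i$ to $D_i + \epsilon$ increases the stationary proportion in environment $i$. In the limit $\epsilon \to 0$, this gives: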

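Corollary 11. For irreducible column stochastic matrix $\mathbf{M}$ and positive diagonal matrix $\mathbf{D}$,

$$\frac{\partial v_\kappa(\mathbf{D}\mathbf{M})}{\partial D_\kappa} > 0. \tag{19}$$

In the case $n = 2$, we have $D_2 > D_1$, $D_2 = D_1 + \epsilon$, $\epsilon > 0$, $\mathbf{a}^\top = \epsilon(M_{21}, M_{22})$, and

$$D_1\mathbf{M} + \mathbf{e}_2\mathbf{a}^\top = \begin{pmatrix} D_1 & 0 \\ 0 & D_1 + \epsilon \end{pmatrix}\mathbf{M} = \mathbf{D}\mathbf{M}.$$

So Theorem 9 gives $v_2(\mathbf{D}\mathbf{M}) > v_2(\mathbf{M})$ regardless of any details of $\mathbf{M}$.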
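The monotone response of the pre-dispersal proportion can be seen in a small numerical experiment. This is a sketch under assumptions: the non-symmetric column stochastic matrix `M`, the baseline growth rates, and the helper names are hypothetical choices made for illustration.

```python
def perron(A, iters=5000):
    """Perron root and sum-normalized right Perron vector of a nonnegative
    irreducible matrix A (a list of rows), by shifted power iteration."""
    n = len(A)
    s = max(sum(row) for row in A)  # iterate on A + s*I so periodic A cannot oscillate
    x = [1.0 / n] * n
    for _ in range(iters):
        y = [s * x[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [yi / total for yi in y]
    rho = sum(A[i][j] * x[j] for i in range(n) for j in range(n))  # e^T A x, with e^T x = 1
    return rho, x


# Hypothetical irreducible column-stochastic M that is not symmetric.
M = [[0.1, 0.6, 0.3],
     [0.7, 0.1, 0.2],
     [0.2, 0.3, 0.5]]


def pre_dispersal_share(D, k):
    """Proportion v_k(DM) of the population in environment k before dispersal."""
    n = len(D)
    DM = [[D[i] * M[i][j] for j in range(n)] for i in range(n)]
    return perron(DM)[1][k]


growth_values = (0.5, 1.0, 2.0, 4.0)
shares = [pre_dispersal_share([1.0, d2, 1.0], 1) for d2 in growth_values]
for d2, share in zip(growth_values, shares):
    print(f"D_2 = {d2:.1f}  v_2(DM) = {share:.4f}")
```

Raising the growth rate of the second environment raises its pre-dispersal share at every step, with no condition placed on `M` beyond irreducibility.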

The discrepancy is resolved by noticing that the order of $\mathbf{M}$ and $\mathbf{D}$ is reversed between Theorem 8 and Theorem 9. The difference between the two is essentially in the phase of the life cycle at which the population is censused.



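We can contrast (19) with the following.

Corollary 12. For irreducible column stochastic matrix $\mathbf{M}$ and positive diagonal matrix $\mathbf{D}$,

$$\frac{\partial}{\partial D_\kappa}\log\big(v_\kappa(\mathbf{M}\mathbf{D})\big) > \frac{\partial}{\partial D_\kappa}\log\left(\frac{\rho(\mathbf{M}\mathbf{D})}{D_\kappa}\right). \tag{20}$$

Proof. Substitution of (10) in (19) and differentiation gives, with $v_\kappa = v_\kappa(\mathbf{M}\mathbf{D})$ and $\rho = \rho(\mathbf{M}\mathbf{D})$:

$$\begin{aligned}
0 < \frac{\partial v_\kappa(\mathbf{D}\mathbf{M})}{\partial D_\kappa}
&= \frac{\partial}{\partial D_\kappa}\left(\frac{D_\kappa v_\kappa}{\rho}\right)
= -\frac{D_\kappa v_\kappa}{\rho^2}\frac{\partial \rho}{\partial D_\kappa} + \frac{1}{\rho}v_\kappa + \frac{1}{\rho}D_\kappa\frac{\partial v_\kappa}{\partial D_\kappa} \\
\iff 0 &< -\frac{1}{\rho}\frac{\partial \rho}{\partial D_\kappa} + \frac{1}{D_\kappa} + \frac{1}{v_\kappa}\frac{\partial v_\kappa}{\partial D_\kappa} \\
\iff \frac{\partial}{\partial D_\kappa}\big(\log(\rho) - \log(D_\kappa)\big) &< \frac{\partial}{\partial D_\kappa}\log(v_\kappa). \qquad ■
\end{aligned}$$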

Comparing (19) and (20) we see that the stationary distributions
at different census phases behave differently.
To summarize:

• the portion in environment $i$ censused before dispersal
always increases with growth rate $D_i$;

• the portion in environment i censused after dispersal 
can, under the right environment transition matrix, 
decrease with increasing growth rate Di. 
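
The contrast between the two census phases can be checked numerically. The following is a minimal sketch, not part of the original analysis: the 2x2 matrix `M`, the growth rates, and the helper names `perron_vector` and `census` are all hypothetical choices made for illustration. It computes $v(\mathbf{DM})$ (censused before dispersal) and $v(\mathbf{MD})$ (censused after dispersal) by power iteration, for a dispersal matrix with strong switching between environments.

```python
def perron_vector(A, iters=2000):
    """Right Perron vector of a positive square matrix, scaled to sum to 1."""
    n = len(A)
    v = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        v = [x / total for x in w]
    return v

# Hypothetical column-stochastic matrix: individuals mostly switch environments.
M = [[0.1, 0.9],
     [0.9, 0.1]]

def census(D):
    """Return (v(DM), v(MD)) for growth rates D = (D1, D2)."""
    n = len(M)
    DM = [[D[i] * M[i][j] for j in range(n)] for i in range(n)]  # rows scaled by D
    MD = [[M[i][j] * D[j] for j in range(n)] for i in range(n)]  # columns scaled by D
    return perron_vector(DM), perron_vector(MD)

before_lo, after_lo = census((1.0, 1.0))   # equal growth rates
before_hi, after_hi = census((2.0, 1.0))   # D1 raised from 1 to 2

print("share in env 1 before dispersal:", before_lo[0], "->", before_hi[0])
print("share in env 1 after dispersal: ", after_lo[0], "->", after_hi[0])
```

With this particular choice of `M`, raising $D_1$ raises the share of environment 1 before dispersal but lowers it after dispersal, because the extra offspring produced in environment 1 mostly end up in environment 2.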

This allows us to make inferences on the duration of the 
environments based on changes in the proportions of the 
population in each environment before and after reproduc- 
tion: 

Corollary 13 (Census Inference on Durations of Envi- 
ronments). 



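Consider the model of McNamara and Dall: $\mathbf{z}(t + 1) =
\mathbf{M}(m)\mathbf{D}\mathbf{z}(t)$, with $\mathbf{M}(m) = [(1-m)\mathbf{P} + m \boldsymbol{\pi} \mathbf{e}^\top]$, where
$\boldsymbol{\pi} = \mathbf{P}\boldsymbol{\pi} = \mathbf{v}(\mathbf{P})$. At a stationary distribution, let $\mathbf{v} =
\mathbf{v}(\mathbf{MD})$ be the vector of proportions of individuals in each
environment before reproduction, and $\mathbf{v}^R = \mathbf{v}(\mathbf{DM})$ be the
proportions after reproduction. For $n = 2$ environments,

1. If $v_1^R > v_1$, we know that $D_1 > D_2$.

2. If in addition, $v_1 < \pi_1$, then we know $E_H(\tau_1, \tau_2) < 2$;
or if $v_1 > \pi_1$, then $E_H(\tau_1, \tau_2) > 2$.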

6.3 Generalization of The McNamara and 
Dall Model 

We shift now from general $\mathbf{M}$ to the specific model of dispersal
in randomly changing environments of McNamara
and Dall (2011). First, we see how the direction of selection
on unconditional dispersal corresponds to the sign of
the post-dispersal fitness-abundance covariance.



Corollary 14 (McNamara and Dall Model with general 
n). 

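Let $\mathbf{M}(m) := [(1 - m)\mathbf{P} + m \boldsymbol{\pi} \mathbf{e}^\top]$, where $\mathbf{P}$ is an irreducible
stochastic matrix, and $\mathbf{P}\boldsymbol{\pi} = \boldsymbol{\pi}$. Let $\mathbf{D} \neq c\,\mathbf{I}$ be a
positive diagonal matrix. Set $\mathbf{v} = \mathbf{v}(\mathbf{MD})$. Then

\[
\mathcal{FA}(\mathbf{MD}) = \mathrm{Cov}(D_i, v_i - \pi_i) > 0
\iff \frac{d}{dm} \rho(\mathbf{M}(m)\mathbf{D}) < 0, \tag{21}
\]

(the reduction phenomenon) and

\[
\mathcal{FA}(\mathbf{MD}) = \mathrm{Cov}(D_i, v_i - \pi_i) < 0 \tag{22}
\]
\[
\iff \frac{d}{dm} \rho(\mathbf{M}(m)\mathbf{D}) > 0 \tag{23}
\]

(departure from reduction).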

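Proof. At $m = 1$,

\[
\mathbf{M}(1)\mathbf{D}\mathbf{v} = \boldsymbol{\pi} \mathbf{e}^\top \mathbf{D} \mathbf{v} = \rho(\mathbf{M}(1)\mathbf{D})\, \boldsymbol{\pi},
\]

so $\mathbf{v} = \boldsymbol{\pi}$, hence $\mathrm{Cov}(D_i, v_i(\mathbf{M}(1)\mathbf{D}) - \pi_i) = 0$. For $m < 1$,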
121b and ^ follow from 1Mb. ■ 



Remark. It should be noted that this correspondence
between the reduction phenomenon and the sign of
the post-dispersal fitness-abundance covariance is specific
to McNamara and Dall's model, $\mathbf{M}(m) = (1 - m)\mathbf{P} +
m\, \mathbf{v}(\mathbf{P})\mathbf{e}^\top$. Shortly we will examine the more general
model, $\mathbf{M}(m) = \mathbf{P}[(1 - m)\mathbf{I} + m\mathbf{Q}]$, in which $\mathbf{v}(\mathbf{PQD}) \neq \mathbf{v}(\mathbf{P})$
generically, so $\mathrm{Cov}(D_i, v_i(\mathbf{M}(1)\mathbf{D}) - v_i(\mathbf{P})) \neq 0$. Thus
departures from reduction do not necessarily correspond
to a negative fitness-abundance covariance. For the general
open problem $\mathbf{M}(m) = (1 - m)\mathbf{A} + m\mathbf{B}$ ([5]), it is
not at all generic for $\mathbf{v}(\mathbf{AD}) = \mathbf{v}(\mathbf{A})$ or $\mathbf{v}(\mathbf{BD}) = \mathbf{v}(\mathbf{B})$,
hence there is no general relationship between the reduction
phenomenon and the sign of the fitness-abundance
covariance.



Next, the expression in McNamara and Dall (2011) involving
the durations of the environments, $\tau_1^{-1} + \tau_2^{-1}$, is
generalized to the harmonic mean of the durations of $n$
environments. The harmonic mean of the expected durations
of states in a Markov chain (expected run lengths or
'exit times') is shown to have a fundamental relationship
to the sum of the eigenvalues (the trace) of its transition
matrix, an identity whose earliest reference I find is
Shorrocks (1978, p. 1017), cited by Geweke et al. (1986):



Lemma 15 (Markov Chain Harmonic Mean and the Trace 
of the Transition Matrix). 

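For a Markov chain with transition matrix $\mathbf{P}$, let $\tau_i$ be
the expected duration of state $i$ (the mean length of runs
of $i$), and let $E_H(\tau_i)$ be their unweighted harmonic mean.
Let $\lambda_i(\mathbf{P})$ be the eigenvalues of $\mathbf{P}$. These are related by
the following:

\[
E_H(\tau_i) := \frac{1}{\frac{1}{n} \sum_{i=1}^{n} \frac{1}{\tau_i}}
= \frac{1}{1 - \frac{1}{n} \sum_{i=1}^{n} \lambda_i(\mathbf{P})} \geq 1, \tag{24}
\]

or, equivalently,

\[
E(\lambda_i(\mathbf{P})) := \frac{1}{n} \sum_{i=1}^{n} \lambda_i(\mathbf{P}) = 1 - \frac{1}{E_H(\tau_i)} \geq 0. \tag{25}
\]

Proof. The expected length of runs of any environmental
state $i$ is (Prais 1955)

\begin{align*}
\tau_i := E(\text{duration of } i) &= \sum_{t=0}^{\infty} t\, P_{ii}^{t-1} (1 - P_{ii}) \\
&= \sum_{t=0}^{\infty} (t+1) P_{ii}^{t} - \sum_{t=1}^{\infty} t\, P_{ii}^{t}
 = \sum_{t=0}^{\infty} P_{ii}^{t} = \frac{1}{1 - P_{ii}}.
\end{align*}

Since $P_{ii} \geq 0$, then $\tau_i \geq 1$ so $E_H(\tau_i) \geq 1$. Since the trace
of the matrix is $\sum_{i=1}^{n} P_{ii} = \sum_{i=1}^{n} \lambda_i(\mathbf{P})$, we have

\[
\sum_{i=1}^{n} \lambda_i(\mathbf{P}) = \sum_{i=1}^{n} P_{ii}
= \sum_{i=1}^{n} \left( 1 - \frac{1}{\tau_i} \right) = n - \sum_{i=1}^{n} \frac{1}{\tau_i}.
\]

Rearrangement of the terms gives (24) and (25).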
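
As a quick numerical illustration of identity (24), the sketch below uses a hypothetical 3x3 column-stochastic matrix (the entries and the helper names `expected_durations` and `harmonic_mean` are assumptions for illustration only). Because the sum of the eigenvalues equals the trace, the mean eigenvalue can be read off the diagonal without computing any eigenvalues.

```python
def expected_durations(P):
    """tau_i = 1 / (1 - P_ii): mean run length of state i."""
    return [1.0 / (1.0 - P[i][i]) for i in range(len(P))]

def harmonic_mean(xs):
    return len(xs) / sum(1.0 / x for x in xs)

# Hypothetical column-stochastic transition matrix (each column sums to 1).
P = [[0.7, 0.2, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.3, 0.6]]

n = len(P)
tau = expected_durations(P)
mean_eigenvalue = sum(P[i][i] for i in range(n)) / n  # trace / n

lhs = harmonic_mean(tau)               # E_H(tau_i)
rhs = 1.0 / (1.0 - mean_eigenvalue)    # right-hand side of (24)
print(lhs, rhs)
```

Here the diagonal is (0.7, 0.5, 0.6), so the mean eigenvalue is 0.6 and both sides of (24) equal 2.5 up to rounding.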



The main result is now presented, tying together the 
eigenvalues of the environment transition matrix, P, the 
effect of dispersal on the population growth rate, and the 
harmonic mean of environment durations. The result gives 
sufficient conditions in terms of the eigenvalues of P for 
departures from the reduction phenomenon. 

This theorem must sacrifice some generality in the envi- 
ronment transition matrices in order to obtain tractability. 
The environmental change process must be a reversible 
Markov chain, which means it does not exhibit direc- 
tional cycles (which requires complex eigenvalues in P), 
which might be thought of as 'currents' through the set 
of states. This is the symmetrizability constraint that ap- 
pears in Karlin's Theorem 5.1, and is required for technical 
reasons described in Methods. It remains an open prob- 
lem whether the following theorem extends to all ergodic 
Markov chains. 

Theorem 16 (Eigenvalues, Reduction Phenomenon, and 
Harmonic Mean of Environment Durations). 

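Let $\mathbf{P}$ and $\mathbf{Q} \in \mathbb{R}^{n,n}$ be transition matrices of reversible
ergodic Markov chains that commute with each other. Let
$\tau_i = 1/(1 - P_{ii})$ be the expected length of runs of state $i$
under iteration of $\mathbf{P}$. Let

\[
\mathbf{M}(m) := \mathbf{P}[(1 - m)\mathbf{I} + m\mathbf{Q}], \tag{26}
\]

and $\mathbf{D} \neq c\,\mathbf{I}$ be a positive diagonal matrix.

1. If all eigenvalues of $\mathbf{P}$ are positive, then

\[
\frac{d}{dm} \rho(\mathbf{M}(m)\mathbf{D}) < 0, \tag{27}
\]

(the reduction phenomenon) and

\[
E_H(\tau_i) > 1 + \frac{1}{n - 1}. \tag{28}
\]

2. If all eigenvalues of $\mathbf{P}$ other than the Perron root 1
are negative, then

\[
\frac{d}{dm} \rho(\mathbf{M}(m)\mathbf{D}) > 0, \tag{29}
\]

(departure from the reduction phenomenon) and

\[
1 \leq E_H(\tau_i) < 1 + \frac{1}{n - 1}. \tag{30}
\]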



The proof is given in Methods section 7.1.

Remark. One should be careful here not to interpret
this result as an implication from $E_H(\tau_i)$ to
$d\rho(\mathbf{M}(m)\mathbf{D})/dm$. While it would be ideal to derive conditions
for the reduction phenomenon from conditions on the
durations of the environments, this is not possible here for
$n \geq 3$; rather, both implications derive from the condition
on the eigenvalues.

However, for $n = 2$ environments, the implication becomes
possible, as seen in this slight generalization of McNamara
and Dall (2011, Theorem B Corollary):

Corollary 17. Let $\mathbf{P}$ and $\mathbf{Q}$ be $2 \times 2$ irreducible stochastic
matrices that commute, and $\mathbf{D} \neq c\,\mathbf{I}$ be a positive diagonal
$2 \times 2$ matrix. Then

\[
\frac{d}{dm} \rho(\mathbf{P}[(1 - m)\mathbf{I} + m\mathbf{Q}]\mathbf{D}) > 0,
\]

if and only if $P_{12} + P_{21} > 1$, or equivalently, $E_H(\tau_1, \tau_2) < 2$.

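Proof. In the case of $n = 2$, there is only one other eigenvalue,
$\lambda_2 = 1 - (P_{12} + P_{21})$. We have

\[
E_H(\tau_1, \tau_2) = \frac{1}{1 - \frac{1}{2}[1 + 1 - (P_{12} + P_{21})]} = \frac{2}{P_{12} + P_{21}}.
\]

So $P_{12} + P_{21} > 1 \iff E_H(\tau_1, \tau_2) < 2 \iff \lambda_2 < 0$, and
by Theorem 16,

\[
\frac{d\rho(\mathbf{MD})}{dm} > 0.
\]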



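The model of McNamara and Dall is a special case
of Theorem 16 and Corollary 17. Note that all irreducible
stochastic $2 \times 2$ matrices are transition matrices of ergodic
reversible Markov chains.

Corollary 18. Theorem 16 includes, as special cases,
$\mathbf{Q} = \mathbf{P}^t$ for $t \geq 1$, and $\mathbf{Q} = \mathbf{P}^\infty = \boldsymbol{\pi} \mathbf{e}^\top$, where $\boldsymbol{\pi} = \mathbf{P}\boldsymbol{\pi}$
is the stationary distribution of $\mathbf{P}$.

Proof. $\mathbf{P}$ and $\mathbf{P}^t$ commute, as do $\mathbf{P}$ and $\mathbf{P}^\infty = \boldsymbol{\pi} \mathbf{e}^\top$,
since $\mathbf{P} \boldsymbol{\pi} \mathbf{e}^\top = \boldsymbol{\pi} \mathbf{e}^\top = \boldsymbol{\pi} \mathbf{e}^\top \mathbf{P}$. When $\mathbf{P}$ is the transition
matrix of a reversible ergodic Markov chain, so too are $\mathbf{P}^t$
and $\mathbf{P}^\infty$. ■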
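
The two-environment criterion of Corollary 17 can be probed numerically for the special case $\mathbf{Q} = \boldsymbol{\pi}\mathbf{e}^\top$, for which $\mathbf{M}(m) = (1-m)\mathbf{P} + m\boldsymbol{\pi}\mathbf{e}^\top$. The following is a hedged sketch: the switching probabilities, growth rates, evaluation point $m = 0.3$, and function names are hypothetical, and the derivative is only estimated by a central finite difference.

```python
import math

def rho2(A):
    """Perron root of a nonnegative 2x2 matrix (closed form)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return 0.5 * (tr + math.sqrt(tr * tr - 4.0 * det))

def rho_of_m(m, p12, p21, D):
    """rho(M(m) D) for M(m) = (1 - m) P + m pi e^T, with column-stochastic P."""
    P = [[1.0 - p21, p12],
         [p21, 1.0 - p12]]
    s = p12 + p21
    pi = [p12 / s, p21 / s]  # stationary distribution, P pi = pi
    A = [[((1.0 - m) * P[i][j] + m * pi[i]) * D[j] for j in range(2)]
         for i in range(2)]
    return rho2(A)

def drho_dm(m, p12, p21, D, h=1e-6):
    """Central finite-difference estimate of d rho / dm."""
    return (rho_of_m(m + h, p12, p21, D) - rho_of_m(m - h, p12, p21, D)) / (2.0 * h)

def harmonic_mean_duration(p12, p21):
    # tau_1 = 1 / P_21 and tau_2 = 1 / P_12, so E_H = 2 / (P_12 + P_21)
    return 2.0 / (p12 + p21)

D = (1.0, 0.5)
fast = drho_dm(0.3, 0.8, 0.7, D)   # P_12 + P_21 = 1.5 > 1
slow = drho_dm(0.3, 0.2, 0.1, D)   # P_12 + P_21 = 0.3 < 1
print(fast, harmonic_mean_duration(0.8, 0.7))
print(slow, harmonic_mean_duration(0.2, 0.1))
```

For the rapidly switching environment the estimated slope is positive (departure from reduction) and the harmonic mean duration is below 2; for the slowly switching environment the slope is negative and the harmonic mean exceeds 2.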

Theorem 16 is able to give us results only for the extrema
of the distribution of eigenvalues of the environment
transition matrix, where the non-Perron eigenvalues are either
all positive or all negative. It would be desirable to
obtain some results for the interior region where there is a
mixture of positive and negative eigenvalues. Short of this,
the location of the interior region can at least be made precise
in terms of its relation to the space of matrices whose
non-Perron eigenvalues are all negative.

This is done by forming a path from such extreme $\mathbf{P}$ to
the opposite extreme, $\mathbf{I}$: a homotopy $[0,1] \mapsto \{\mathbf{M}\} \subset
\mathbb{R}^{n \times n}$. For $\mathbf{M}(m)$, the two endpoints of the path are
$\mathbf{M}(0, m) = (1 - m)\mathbf{I} + m\mathbf{Q}$ and $\mathbf{M}(1, m) = \mathbf{P}[(1 - m)\mathbf{I} +
m\mathbf{Q}]$. The path can be easily created by a convex combination,
$\mathbf{M}(\alpha, m) = [(1 - \alpha)\mathbf{I} + \alpha\mathbf{P}][(1 - m)\mathbf{I} + m\mathbf{Q}]$,
parameterized by $\alpha \in [0,1]$. It is straightforward from
Lemma 15 that

\[
E_H(\tau_i([1 - \alpha]\mathbf{I} + \alpha\mathbf{P})) = \frac{1}{\alpha} E_H(\tau_i(\mathbf{P})),
\]

so as $\alpha$ goes to 0, the harmonic mean of the environment
durations goes to infinity. Using the homotopy between
two extremes, we will now see that, for any $\mathbf{P}$, there is
always some $\alpha$ below which the reduction phenomenon
holds.

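Corollary 19 (Convex Combination with an Extreme $\mathbf{P}$).

Let

\[
\mathbf{M}(\alpha, m) := [(1 - \alpha)\mathbf{I} + \alpha\mathbf{P}][(1 - m)\mathbf{I} + m\mathbf{Q}],
\]

where $\mathbf{P}$ and $\mathbf{Q}$ are transition matrices of ergodic reversible
Markov chains that commute with each other, and $\alpha, m \in
[0, 1]$. Let $\mathbf{D} \neq c\,\mathbf{I}$ be a positive diagonal matrix.

Suppose that $\frac{d}{dm} \rho(\mathbf{M}(1, m)\mathbf{D}) > 0$ for $m \in (0, 1]$. Then
there exist critical values $\alpha_0, \alpha_1$ with $1/2 \leq \alpha_0 \leq \alpha_1 < 1$,
such that for $m \in (0, 1]$,

\[
\frac{d}{dm} \rho(\mathbf{M}(\alpha, m)\mathbf{D}) < 0 \quad \text{for } \alpha \in [0, \alpha_0),
\]

and

\[
\frac{d}{dm} \rho(\mathbf{M}(\alpha, m)\mathbf{D}) > 0 \quad \text{for } \alpha \in (\alpha_1, 1].
\]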

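Proof. $\mathbf{M}(\alpha, 0) = (1 - \alpha)\mathbf{I} + \alpha\mathbf{P}$ plays the role of $\mathbf{P}$ in
Theorem 16, so we need to know when $\lambda_i(\mathbf{M}(\alpha, 0)) > 0\ \forall i$.

Let $\lambda_{\min}(\mathbf{P}) := \min_i \lambda_i(\mathbf{P})$. Then

\begin{align*}
\lambda_{\min}([(1 - \alpha)\mathbf{I} + \alpha\mathbf{P}]) = (1 - \alpha) + \alpha \lambda_{\min}(\mathbf{P}) &> 0 \\
\iff \lambda_{\min}(\mathbf{P}) &> (\alpha - 1)/\alpha = 1 - 1/\alpha.
\end{align*}

By Perron-Frobenius theory, irreducible stochastic $\mathbf{P}$
means $\lambda_{\min}(\mathbf{P}) \geq -1$. So $-1 > 1 - 1/\alpha$ (i.e. $\alpha < 1/2$)
assures $\lambda_i(\mathbf{M}(\alpha, 0)) > 0\ \forall i$. Thus $\alpha_0$ is no smaller than
1/2.

Since $\rho(\mathbf{M}(\alpha, m)\mathbf{D})$ is a continuous function of $\alpha$ and
$m$, we know $\partial\rho(\mathbf{M}(\alpha, m)\mathbf{D})/\partial m > 0$ for $\alpha$ in some neighborhood
$(\alpha_1, 1]$. ■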
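
For two environments the homotopy can be traced explicitly, since the non-Perron eigenvalue of $(1-\alpha)\mathbf{I} + \alpha\mathbf{P}$ is $1 - \alpha(P_{12} + P_{21})$. The sketch below is an illustration under assumed values, not part of the original proof: the switching probabilities, growth rates, and function names are hypothetical, $\mathbf{Q} = \boldsymbol{\pi}\mathbf{e}^\top$ is used, and the slope in $m$ is estimated by finite differences.

```python
import math

P12, P21 = 0.8, 0.7          # hypothetical fast-switching environment
D = (1.0, 0.5)               # hypothetical growth rates
S = P12 + P21
PI = (P12 / S, P21 / S)      # stationary distribution of P

def p_alpha(alpha):
    """(1 - alpha) I + alpha P for the column-stochastic 2x2 matrix P."""
    P = [[1.0 - P21, P12], [P21, 1.0 - P12]]
    return [[(1.0 - alpha) * (1.0 if i == j else 0.0) + alpha * P[i][j]
             for j in range(2)] for i in range(2)]

def rho(alpha, m):
    """Perron root of M(alpha, m) D, using M = (1 - m) P_alpha + m pi e^T."""
    Pa = p_alpha(alpha)
    A = [[((1.0 - m) * Pa[i][j] + m * PI[i]) * D[j] for j in range(2)]
         for i in range(2)]
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return 0.5 * (tr + math.sqrt(tr * tr - 4.0 * det))

def slope(alpha, m=0.3, h=1e-6):
    return (rho(alpha, m + h) - rho(alpha, m - h)) / (2.0 * h)

def harmonic_mean_duration(alpha):
    Pa = p_alpha(alpha)
    inv_tau = [1.0 - Pa[i][i] for i in range(2)]   # 1 / tau_i
    return 2.0 / sum(inv_tau)

low, high = slope(0.4), slope(0.9)
print(low, high)
print(harmonic_mean_duration(0.5), harmonic_mean_duration(1.0))
```

With these values the non-Perron eigenvalue changes sign at $\alpha = 2/3$, which is consistent with the bound $\alpha_0 \geq 1/2$: the estimated slope is negative at $\alpha = 0.4$ and positive at $\alpha = 0.9$, and halving $\alpha$ doubles the harmonic mean duration.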



So now we have characterized the interior region as
$[\alpha_0, \alpha_1]$, where the behavior of $\partial\rho(\mathbf{M}(\alpha, m)\mathbf{D})/\partial m$ needs to
be characterized. We do not know, for example, if $\partial\rho/\partial m$
keeps the same sign for all $m$ at a given $\alpha$, or whether it
can change sign more than once on $[\alpha_0, \alpha_1]$, and so forth.
It remains an open problem.

Next, a particular kind of environmental change process
is considered in which there is no causal connection
between sequential environments. In other words, when
the environment changes, it has no memory of its previous
state. In genetics this is Kingman's 'House of Cards'
model of mutation (Kingman 1978, 1980). It is shown
for such a memoryless environment that the reduction phenomenon
is the only possible outcome.

Theorem 20 ('House of Cards' Environmental Change). 



Let $\mathbf{M}(m)$ be defined as in (26) of Theorem 16. Let $\mathbf{D} \neq
c\,\mathbf{I}$ be a positive diagonal matrix. Suppose that when the
environment changes, its current state has no influence on
its next state. Suppose further that the expected duration
of an environment is $\tau_i = \tau$ for all environments $i$.

If $\tau = 1$, then $\frac{d}{dm} \rho(\mathbf{M}(m)\mathbf{D}) = 0$.

If $\tau > 1$, then $\frac{d}{dm} \rho(\mathbf{M}(m)\mathbf{D}) < 0$.
dm 

The proof is given in Methods section 7.2. Note that in
this case, $\tau > 1$ becomes a sufficient condition for selection
for reduced dispersal, which it is not in the general case
in Theorem 16.
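
A numerical sketch of the 'House of Cards' case for three environments follows. It assumes, as in the proof in Methods section 7.2, that $\mathbf{P} = (1 - \sigma)\boldsymbol{\pi}\mathbf{e}^\top + \sigma\mathbf{I}$ with $\sigma = 1 - 1/\tau$, and it takes $\mathbf{Q} = \boldsymbol{\pi}\mathbf{e}^\top$; the vector `PI`, the growth rates `D`, the evaluation point $m = 0.3$, and the helper names are hypothetical. The Perron root is obtained by power iteration and the derivative by a central finite difference.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def perron_root(A, iters=2000):
    """Perron root of a positive matrix by power iteration."""
    n = len(A)
    v = [1.0 / n] * n
    growth = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        growth = sum(w)              # equals the growth factor since sum(v) == 1
        v = [x / growth for x in w]
    return growth

PI = [0.5, 0.3, 0.2]   # hypothetical distribution of new environments
D = [1.0, 0.7, 0.4]    # hypothetical growth rates
N = 3

def rho(m, tau):
    sigma = 1.0 - 1.0 / tau
    eye = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
    Q = [[PI[i] for _ in range(N)] for i in range(N)]             # pi e^T
    P = [[(1.0 - sigma) * Q[i][j] + sigma * eye[i][j] for j in range(N)]
         for i in range(N)]
    inner = [[(1.0 - m) * eye[i][j] + m * Q[i][j] for j in range(N)]
             for i in range(N)]
    M = matmul(P, inner)                                          # P[(1-m)I + mQ]
    MD = [[M[i][j] * D[j] for j in range(N)] for i in range(N)]
    return perron_root(MD)

def slope(tau, m=0.3, h=1e-6):
    return (rho(m + h, tau) - rho(m - h, tau)) / (2.0 * h)

for tau in (1.0, 2.0, 5.0):
    print(tau, slope(tau))
```

For $\tau = 1$ the matrix $\mathbf{M}(m)$ reduces to $\boldsymbol{\pi}\mathbf{e}^\top$ for every $m$, so the estimated slope is zero up to rounding; for $\tau > 1$ the estimated slope is negative, in line with Theorem 20.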

6.4 The Conditional Dispersal Model 

The result of McNamara and Dall (2011) that drew particular
attention was their finding that, under a broad range
of circumstances, it is better for the organism to ignore 
cues about the environment and instead follow philopatry. 
The general form for their cue model is a modification of 
0: 

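\[
\mathbf{MD} = \mathbf{P}[(\mathbf{I} - \mathbf{C}) + \boldsymbol{\pi} \mathbf{e}^\top \mathbf{C}]\mathbf{D},
\]

where $\mathbf{C}$ is a diagonal matrix of the conditional dispersal
probabilities, $C_i$, that an individual disperses given it is in
environment $i$.

A change in an organism's response to cues about its
environment is manifest as changes to $C_i$, hence the object
of interest is the change in asymptotic growth as the
conditional dispersal rate is changed:

\[
\frac{\partial}{\partial C_\kappa} \rho(\mathbf{P}[(\mathbf{I} - \mathbf{C}) + \boldsymbol{\pi} \mathbf{e}^\top \mathbf{C}]\mathbf{D}).
\]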



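The next result shows that an organism should increase
its dispersal from any environment where its destinations
correlate better with left Perron vector $\mathbf{u}(\mathbf{MD})$ than does
staying put, and that in general there is always at least
one such environment.

Theorem 21 (Conditional Dispersal).
Let $\mathbf{M} := \mathbf{P}[(\mathbf{I} - \mathbf{C}) + \boldsymbol{\pi} \mathbf{e}^\top \mathbf{C}]$, where $\mathbf{P}$ is an irreducible
stochastic matrix, $\mathbf{C}$ and $\mathbf{D}$ are positive diagonal matrices,
with $C_i \in (0, 1)$, and $\boldsymbol{\pi} = \mathbf{P}\boldsymbol{\pi}$. Refer to the left and right
Perron vectors as $\mathbf{u} = \mathbf{u}(\mathbf{MD})$ and $\mathbf{v} = \mathbf{v}(\mathbf{MD})$. Then:

1. The derivative of the spectral radius with respect to
each $C_\kappa$ is

\[
\frac{\partial}{\partial C_\kappa} \rho(\mathbf{P}[(\mathbf{I} - \mathbf{C}) + \boldsymbol{\pi} \mathbf{e}^\top \mathbf{C}]\mathbf{D})
= D_\kappa v_\kappa\, n \left[ \mathrm{Cov}(u_i, \pi_i) - \mathrm{Cov}(u_i, P_{i\kappa}) \right]. \tag{31}
\]

2. There is always at least one $\kappa$ for which

\[
\frac{\partial}{\partial C_\kappa} \rho(\mathbf{P}[(\mathbf{I} - \mathbf{C}) + \boldsymbol{\pi} \mathbf{e}^\top \mathbf{C}]\mathbf{D}) > 0,
\]

and at least one $\kappa$ for which

\[
\frac{\partial}{\partial C_\kappa} \rho(\mathbf{P}[(\mathbf{I} - \mathbf{C}) + \boldsymbol{\pi} \mathbf{e}^\top \mathbf{C}]\mathbf{D}) < 0,
\]

unless $\mathbf{D} = c\,\mathbf{I}$ or $\mathbf{P} = \boldsymbol{\pi} \mathbf{e}^\top$.

3. The gradient of the spectral radius with respect to $\mathbf{C}$ is

\[
\nabla_{\mathbf{C}}\, \rho(\mathbf{MD}) := \left[ \frac{\partial \rho(\mathbf{MD})}{\partial C_\kappa} \right]_{\kappa = 1}^{n}
= \mathbf{u}^\top (\boldsymbol{\pi} \mathbf{v}^\top - \mathbf{P}\mathbf{D}_{\mathbf{v}})\mathbf{D}.
\]

4. There is always a subspace, $\mathcal{N}$, of perturbations
of $\mathbf{C}$ that are neutral for $\rho(\mathbf{MD})$. $\mathcal{N} =
\{\boldsymbol{\xi} : \nabla_{\mathbf{C}}\, \rho(\mathbf{MD})\, \boldsymbol{\xi} = 0\}$ is an $n - 1$ dimensional
linear subspace. Its basis includes strictly positive
$\boldsymbol{\xi} = \mathbf{D}_{\mathbf{v}}^{-1} \mathbf{D}^{-1} \boldsymbol{\pi}$, i.e.

\[
\xi_\kappa = \frac{\pi_\kappa}{D_\kappa v_\kappa}. \tag{32}
\]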



The proof is given in Methods section 7.3.

Theorem 21 shows that an organism can increase
its asymptotic growth rate by dispersing more from
any environment $\kappa$ for which $\mathrm{Cov}(u_i(\mathbf{MD}), \pi_i) >
\mathrm{Cov}(u_i(\mathbf{MD}), P_{i\kappa})$, which means that its environment if it
disperses (distributed as $\boldsymbol{\pi}$) correlates better with $\mathbf{u}(\mathbf{MD})$
than its environment if it stays put (distributed as $[\mathbf{P}]_\kappa$).

Furthermore, there is always at least one such environment
where increased dispersal is advantageous, and at
least one other environment where decreased dispersal is
advantageous, unless dispersal is neutral. Dispersal is neutral
when growth rates are the same in all environments
($\mathbf{D} = c\,\mathbf{I}$), or the present environment has no influence on
the next environment ($\mathbf{P} = \boldsymbol{\pi} \mathbf{e}^\top$).
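
The gradient formula of Theorem 21 can be compared against a finite-difference estimate. The sketch below is illustrative only: the matrix `P`, the rates `D` and `C`, and all function names are hypothetical, Perron vectors are obtained by power iteration, and the left vector is scaled so that $\mathbf{u}^\top \mathbf{v} = 1$. It evaluates $D_\kappa v_\kappa \sum_i u_i(\pi_i - P_{i\kappa})$, which equals the covariance form in (31), and also checks the neutral direction (32).

```python
def perron(A, iters=3000):
    """Perron root and right Perron vector (summing to 1) of a positive matrix."""
    n = len(A)
    v = [1.0 / n] * n
    root = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        root = sum(w)
        v = [x / root for x in w]
    return root, v

# Hypothetical irreducible column-stochastic P, growth rates D, dispersal rates C.
P = [[0.6, 0.1, 0.2],
     [0.3, 0.7, 0.2],
     [0.1, 0.2, 0.6]]
D = [1.0, 0.7, 0.4]
C = [0.2, 0.3, 0.4]
N = 3
_, PI = perron(P)  # stationary distribution, P pi = pi

def build_MD(c):
    # Column j of M = P[(I - C) + pi e^T C] is (1 - C_j)[P]_j + C_j pi, since P pi = pi.
    return [[((1.0 - c[j]) * P[i][j] + c[j] * PI[i]) * D[j] for j in range(N)]
            for i in range(N)]

A = build_MD(C)
_, v = perron(A)
_, u = perron([[A[j][i] for j in range(N)] for i in range(N)])  # left vector via transpose
scale = sum(u[i] * v[i] for i in range(N))
u = [x / scale for x in u]                                      # enforce u^T v = 1

formula = [D[k] * v[k] * sum(u[i] * (PI[i] - P[i][k]) for i in range(N))
           for k in range(N)]

def numeric(k, h=1e-6):
    up = list(C); up[k] += h
    dn = list(C); dn[k] -= h
    return (perron(build_MD(up))[0] - perron(build_MD(dn))[0]) / (2.0 * h)

estimate = [numeric(k) for k in range(N)]
xi = [PI[k] / (D[k] * v[k]) for k in range(N)]          # neutral direction (32)
neutral = sum(formula[k] * xi[k] for k in range(N))
print(formula, estimate, neutral)
```

In this example the formula and the finite-difference estimate agree closely, the gradient has components of both signs, and its product with the direction in (32) vanishes up to rounding.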




Figure 1: Gradient of the asymptotic growth rate, $\rho(\mathbf{MD})$,
over $(C_1, C_2) \in [0,1]^2$. Lighter means higher $\rho(\mathbf{MD})$. Model
parameters: $(D_1, D_2) = (1.0, 0.5)$, $(P_{12}, P_{21}) = (0.204, 0.107)$.
Diagonal dashed line $C_1 = C_2$ corresponds to variation in
unconditional dispersal rates. Perturbations are away from
$(C_1, C_2) = (0.1, 0.1)$. Regions A+B+C increase $\rho(\mathbf{MD})$. Regions
D+E decrease $\rho(\mathbf{MD})$. McNamara and Dall constrain
variation to fall within the parallelogram, with slope $\epsilon/(1 - \epsilon)$
for the bottom, and $(1 - \epsilon)/\epsilon$ for the side, where $\epsilon$ is the error
rate for environmental cues; the error rate must be small
enough for the parallelogram to enter region C for conditional
dispersal to evolve. But the unconstrained ESS here
is $(C_1, C_2) = (0, 1)$, and mutants anywhere in regions B+C
increase conditional dispersal rate $C_2$ and are advantageous.



We see that Theorem 21 provides another situation that
departs from the reduction phenomenon. Conditional dispersal
is analogous to directed mutation (Cairns et al.
1988; Hall 1990; Lenski and Mittler 1993). I would not
go so far as to say it exemplifies the principle of partial
control, which was conceived for the situation of multiple
undirected transformation processes, such as recombination
in the presence of mutation. It is possible, however,
to view conditional dispersal as control over only a part of
the set of dispersal probabilities.

The existence of this departure from reduction holds
for any environment transition matrix $\mathbf{P} \neq \boldsymbol{\pi} \mathbf{e}^\top$. The environment
may even be constant, $\mathbf{P} = \mathbf{I}$, in which case
as long as $\mathbf{D} \neq c\,\mathbf{I}$ so $\mathbf{u} \neq \mathbf{e}$, then
$\frac{\partial}{\partial C_\kappa} \rho([(\mathbf{I} - \mathbf{C}) + \boldsymbol{\pi} \mathbf{e}^\top \mathbf{C}]\mathbf{D}) > 0$
for every $\kappa$ where $u_\kappa$ is below the
weighted average $\sum_{i=1}^{n} u_i \pi_i$ (see (31)).

Philopatry is clearly not the evolutionarily stable state
(ESS) when there is environmental change, since there is
always at least one environment where conditional dispersal
is advantageous. To understand how McNamara and
Dall conclude that philopatry can be an ESS, it is helpful
to look at the entire 'adaptive landscape' of $\rho(\mathbf{MD})$ as a
function of conditional dispersal rates $(C_1, C_2)$.

McNamara and Dall posit a species that can vary its
probabilities, $(p_1, p_2)$, of dispersing in response to a binary
cue, where environment $i$ produces the wrong cue with
probability $\epsilon_i$. The conditional dispersal rates are thus

\[
\begin{pmatrix} C_1 \\ C_2 \end{pmatrix} =
\begin{pmatrix} 1 - \epsilon_1 & \epsilon_1 \\ \epsilon_2 & 1 - \epsilon_2 \end{pmatrix}
\begin{pmatrix} p_1 \\ p_2 \end{pmatrix} +
\begin{pmatrix} C_{\min} \\ C_{\min} \end{pmatrix}. \tag{33}
\]

Assuming that the species can vary $(p_1, p_2)$ over the range
$[0, 1] \times [0, 1]$, the variation in $(C_1, C_2)$ falls within the parallelogram
in Figure 1.

The example depicted in Figure 1 has a moderate rate
of environmental change and differential growth between
two environments. Darker means smaller $\rho(\mathbf{MD})$. Variation
along the white diagonal line represents unconditional
dispersal, and the gradient exhibits the reduction
phenomenon. The contour lines of constant $\rho(\mathbf{MD})$ are
shown by (32) to have slope

\[
\frac{\pi_2 / (D_2 v_2)}{\pi_1 / (D_1 v_1)} = \frac{\pi_2 D_1 v_1}{\pi_1 D_2 v_2}.
\]



The dispersal rates of the resident population are
$(C_1, C_2) = (0.1, 0.1)$, the minimal dispersal attainable.
The labeled regions A, B, C, D, and E demarcate the behavior
where $(C_1, C_2)$ departs from $(0.1, 0.1)$. Any mutant
that falls within regions A, B, or C is advantageous. Regions
B and C comprise increases in conditional dispersal
from environment 2. The slopes of the sides of the parallelogram
derive from the cue error rate, $\epsilon$.

Advantageous mutants arise only in the intersection
of the parallelogram and region C. For the intersection
to be non-empty, $\epsilon$ must be small enough. When the
error rate is so high that the parallelogram is contained
entirely in region D, then the ESS is the lower left corner,
the minimal value of dispersal.

We can see that the ESS is very sensitive to the error
rate, however. With a slight decrease in $\epsilon$, the ESS can
shift from the lower left corner to the upper left corner of
the parallelogram, which describes McNamara and Dall's
result.

Therefore, in this adaptive landscape, it becomes very 
important how well the genetic and developmental sys- 
tem fills out the parallelogram with heritable variation. If 
the variation does not fully fill it out, at least two novel 
outcomes become possible. A slightly convex distribution 
overlapping region C would yield an intermediate level of 
dispersal as the fittest that occur. A slightly concave dis- 
tribution would result in a bimodal distribution of the 
fittest phenotypes. This opens up potential for polymor- 
phisms, disruptive selection, history dependence, or evo- 
lutionary volatility in the phenotype. 



Moreover, the parallelogram is based on a particular
model (33) of how organisms disperse. There is no categorical
exclusion of genetic variation from accessing any
point in the square $(C_1, C_2) \in [0, 1]^2$. Rather, it depends
on the details of the organism, its capabilities, and its
genotype-phenotype map. Any time an evolutionary outcome
is sensitive to such details, one is bound to find interesting
phenomena in the natural history.

7 Methods 



The lengthier proofs for Theorems 16, 20 and 21 are now
provided, prefaced by preparatory results, Lemma 22 and
Theorem 23. It is here that we encounter the tractability
afforded by using the transition matrices of reversible ergodic
Markov chains, which is the key technique adopted
from Friedland and Karlin (1975, Theorem 4.1) and Karlin's
Theorem 5.1.

It should be noted that Karlin's Theorem 5.2 is not usable
here, because of the presence of matrix $\mathbf{P}$ in $\mathbf{M}(m) =
\mathbf{P}[(1 - m)\mathbf{I} + m\mathbf{Q}]$. This is why it has been an open problem
(Altenberg 2004). An initial inroad on this open problem
was obtained through application of elements from Karlin's
Theorem 5.1 to the analysis of multivariate, multiple
locus mutation rate evolution in Altenberg (2011). The
application of these techniques is further extended here.
Theorem 2 in Altenberg (2011), a multivariate reduction
principle for multiple loci in mutation-selection balance,
is in fact a special case of (27) in Theorem 16 given here.

In the proofs to follow, (41), (43), (44) derive from Karlin's
Theorem 5.1 proof. Other steps, including the use
of a canonical form for symmetrizable $\mathbf{M}(m)$ (37), (38),
(39), and (53), are drawn from the analysis in Altenberg
(2011). Most of the remaining steps arise naturally from
the problem, and may prove useful in other contexts.

Theorem 16 first requires a characterization of the spectral
radius of $\mathbf{M}(m)\mathbf{D}$, which relies on the canonical form
for $\mathbf{M}(m)$ that exists when it is constrained to be symmetrizable.

Lemma 22 (Canonical Form for Symmetrizable M(m)). 

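Let $\mathbf{P}$ and $\mathbf{Q} \in \mathbb{R}^{n,n}$ be transition matrices of ergodic
reversible Markov chains that commute with each other.
Let

\[
\mathbf{M}(m) := \mathbf{P}[(1 - m)\mathbf{I} + m\mathbf{Q}].
\]

Then $\mathbf{P}$, $\mathbf{Q}$, and $\mathbf{M}$ can be decomposed as

\[
\mathbf{P} = \mathbf{D}_{\boldsymbol{\pi}}^{1/2} \mathbf{K} \boldsymbol{\Lambda}_P \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2}, \tag{34}
\]
\[
\mathbf{Q} = \mathbf{D}_{\boldsymbol{\pi}}^{1/2} \mathbf{K} \boldsymbol{\Lambda}_Q \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2}, \tag{35}
\]

and

\[
\mathbf{M}(m) = \mathbf{D}_{\boldsymbol{\pi}}^{1/2} \mathbf{K} \boldsymbol{\Lambda}_P [(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q] \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2}, \tag{36}
\]

where $\mathbf{P}\boldsymbol{\pi} = \mathbf{Q}\boldsymbol{\pi} = \boldsymbol{\pi}$, with $\mathbf{e}^\top \boldsymbol{\pi} = 1$, $\mathbf{K}$ is an orthogonal
matrix, and $\boldsymbol{\Lambda}_P$ and $\boldsymbol{\Lambda}_Q$ are diagonal matrices of the
eigenvalues of $\mathbf{P}$ and $\mathbf{Q}$, respectively.

Furthermore, the first column of $\mathbf{K}$ is $[\mathbf{K}]_1 = \boldsymbol{\pi}^{1/2}$
(element-wise square root).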

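Proof. The transition matrices of ergodic reversible
Markov chains, $\mathbf{M}$, can be represented in a canonical way
(Keilson 1979, p. 33; Ababneh et al. 2006, p. 296; Altenberg
2011, Lemmas 1 and 2) as

\[
\mathbf{M} = \mathbf{B}\mathbf{K}\boldsymbol{\Lambda}\mathbf{K}^\top \mathbf{B}^{-1}, \tag{37}
\]

where $\mathbf{B}$ is a positive diagonal matrix, unique up to scaling,
and $\mathbf{K}$ is an orthogonal matrix, i.e. $\mathbf{K}\mathbf{K}^\top = \mathbf{K}^\top \mathbf{K} = \mathbf{I}$.

Any such $\mathbf{M}$ is clearly diagonalizable since $\boldsymbol{\Lambda}$ is a diagonal
matrix. Diagonalizable $\mathbf{P}$ and $\mathbf{Q}$ commute by hypothesis,
so they can be simultaneously diagonalized (Horn and
Johnson 1985, Theorem 1.3.19, p. 52), which means there
exists invertible $\mathbf{X}$ such that

\[
\mathbf{P} = \mathbf{X}\boldsymbol{\Lambda}_P \mathbf{X}^{-1} \quad \text{and} \quad \mathbf{Q} = \mathbf{X}\boldsymbol{\Lambda}_Q \mathbf{X}^{-1}.
\]

Hence $\mathbf{M}(m) = \mathbf{X}\boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]\mathbf{X}^{-1}$. Combining
these two forms, the common matrix $\mathbf{X}$ can be represented
as $\mathbf{X} = \mathbf{B}\mathbf{K}$.

Next, it is shown that

\[
[\mathbf{K}]_1 = \boldsymbol{\pi}^{1/2} \tag{38}
\]

and

\[
\mathbf{B} = c\, \mathbf{D}_{\boldsymbol{\pi}}^{1/2}, \quad c > 0, \tag{39}
\]

satisfy $\mathbf{e}^\top \mathbf{P} = \mathbf{e}^\top$ and $\mathbf{P}\boldsymbol{\pi} = \boldsymbol{\pi}$. Since $\mathbf{K}$ is orthogonal,
$[\mathbf{K}]_1 = \boldsymbol{\pi}^{1/2}$ if and only if $(\boldsymbol{\pi}^{1/2})^\top \mathbf{K} = \mathbf{e}_1^\top$, in which case,
recalling that $\lambda_{P1} = 1$, substitution gives

\begin{align*}
\mathbf{e}^\top \mathbf{P} &= \mathbf{e}^\top \mathbf{D}_{\boldsymbol{\pi}}^{1/2} \mathbf{K} \boldsymbol{\Lambda}_P \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2}
 = \mathbf{e}_1^\top \boldsymbol{\Lambda}_P \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2} \\
&= \mathbf{e}_1^\top \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2} = [\mathbf{K}]_1^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2} = \mathbf{e}^\top,
\end{align*}

and

\begin{align*}
\mathbf{P}\boldsymbol{\pi} &= \mathbf{D}_{\boldsymbol{\pi}}^{1/2} \mathbf{K} \boldsymbol{\Lambda}_P \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2} \boldsymbol{\pi}
 = \mathbf{D}_{\boldsymbol{\pi}}^{1/2} \mathbf{K} \boldsymbol{\Lambda}_P \mathbf{K}^\top \boldsymbol{\pi}^{1/2} \\
&= \mathbf{D}_{\boldsymbol{\pi}}^{1/2} \mathbf{K} \boldsymbol{\Lambda}_P \mathbf{e}_1 = \mathbf{D}_{\boldsymbol{\pi}}^{1/2} [\mathbf{K}]_1 = \boldsymbol{\pi}.
\end{align*}

Substitution of $\mathbf{B} = c\, \mathbf{D}_{\boldsymbol{\pi}}^{1/2}$ in the form (37) produces (34),
(35), and (36).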



Remark. For any family of commuting symmetrizable
stochastic matrices, $\mathbf{K}$ and $\mathbf{B}$ (up to scaling) are uniquely
determined. Therefore, the only variation possible for the
family is in $\lambda_i$, $i = 2, \ldots, n$, which means there are at most
$n - 1$ degrees of freedom of variation in the family. ■

Theorem 23 (The Spectral Radius). 
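Let $\mathbf{P}$ and $\mathbf{Q} \in \mathbb{R}^{n,n}$ be transition matrices of ergodic reversible
Markov chains that commute with each other, let
$\boldsymbol{\pi}$ be their common right Perron vector, and let $\{\lambda_{Pi}\}$ and
$\{\lambda_{Qi}\}$ be their eigenvalues. Let

\[
\mathbf{M}(m) := \mathbf{P}[(1 - m)\mathbf{I} + m\mathbf{Q}].
\]

Let $\mathbf{D}$ be a positive diagonal matrix. Set $\mathbf{v} = \mathbf{v}(\mathbf{M}(m)\mathbf{D})$
and $\mathbf{u} = \mathbf{u}(\mathbf{M}(m)\mathbf{D})$.
Then

\[
\rho(\mathbf{M}(m)\mathbf{D}) = \sum_{i=1}^{n} \lambda_{Pi}[(1 - m) + m\lambda_{Qi}]\, y_i^2, \tag{40}
\]

where

\[
\mathbf{y} = (\mathbf{v}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1} \mathbf{D} \mathbf{v})^{-1/2}\, \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2} \mathbf{D} \mathbf{v},
\]

and $\mathbf{K}$ is from the canonical form in Lemma 22.

The left and right Perron vectors of $\mathbf{M}(m)\mathbf{D}$ are related
by

\[
\mathbf{u} = \frac{1}{\mathbf{v}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1} \mathbf{D} \mathbf{v}}\, \mathbf{D}_{\boldsymbol{\pi}}^{-1} \mathbf{D} \mathbf{v}.
\]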



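Proof. Canonical form (36) is used to produce a symmetric
matrix similar to $\mathbf{M}(m)\mathbf{D}$, which allows use of the
Rayleigh-Ritz formula for the spectral radius. The expression
simplifies to a sum of terms involving the eigenvalues
of the stochastic matrices $\mathbf{P}$ and $\mathbf{Q}$.

For brevity let $\boldsymbol{\Psi} := \mathbf{K}\boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]\mathbf{K}^\top$, so
$\mathbf{M}(m) = \mathbf{B}\boldsymbol{\Psi}\mathbf{B}^{-1}$. Multiplication by $\mathbf{B}$, $\mathbf{D}^{1/2}$, and their
inverses (where the positive diagonal $\mathbf{D}$ ensures the existence
of $\mathbf{D}^{1/2}$ and $\mathbf{D}^{-1/2}$) gives the identities:

\begin{align*}
\rho(\mathbf{M}(m)\mathbf{D}) &= \rho(\mathbf{B}\boldsymbol{\Psi}\mathbf{B}^{-1}\mathbf{D}) = \rho(\boldsymbol{\Psi}\mathbf{B}^{-1}\mathbf{D}\mathbf{B}) \\
&= \rho(\boldsymbol{\Psi}\mathbf{D}) = \rho(\mathbf{D}^{1/2}\boldsymbol{\Psi}\mathbf{D}^{1/2}) = \rho(\mathbf{S}),
\end{align*}

where

\[
\mathbf{S} := \mathbf{D}^{1/2}\boldsymbol{\Psi}\mathbf{D}^{1/2}
= \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]\mathbf{K}^\top \mathbf{D}^{1/2}. \tag{41}
\]

Since $\mathbf{S}$ is symmetric, we may apply the Rayleigh-Ritz
variational formula for the spectral radius (Horn and Johnson
1985, Theorem 4.2.2, p. 176):

\[
\rho(\mathbf{A}) = \max_{\mathbf{x}^\top \mathbf{x} = 1} \mathbf{x}^\top \mathbf{A} \mathbf{x}. \tag{42}
\]

This yields

\[
\rho(\mathbf{M}(m)\mathbf{D}) = \max_{\mathbf{x}^\top \mathbf{x} = 1}
\mathbf{x}^\top \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]\mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x}. \tag{43}
\]

Since $\mathbf{M}$ is irreducible and $\mathbf{D}$ a positive diagonal matrix,
$\mathbf{MD}$ is irreducible, so by Perron-Frobenius theory there is
a unique eigenvector $\mathbf{x} > 0$ that yields the maximum in
(43), allowing us to write

\[
\rho(\mathbf{M}(m)\mathbf{D}) = \mathbf{x}^\top \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]\mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x}. \tag{44}
\]

Define

\[
\mathbf{y} := \mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x}. \tag{45}
\]

Substitution of (45) into (44) yields (40):

\begin{align*}
\rho(\mathbf{M}(m)\mathbf{D}) &= \mathbf{x}^\top \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]\mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x} \\
&= \mathbf{y}^\top \boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]\mathbf{y} \\
&= \sum_{i=1}^{n} \lambda_{Pi}[(1 - m) + m\lambda_{Qi}]\, y_i^2.
\end{align*}

Next, $\mathbf{y}$ will be solved in terms of $\mathbf{v}$ by solving for
$\mathbf{x}$, using the following two facts. For brevity, define
$\boldsymbol{\Lambda}^{(m)} := \boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]$, and write $\mathbf{M} = \mathbf{M}(m) =
\mathbf{B}\mathbf{K}\boldsymbol{\Lambda}^{(m)}\mathbf{K}^\top \mathbf{B}^{-1}$:

1. $\rho(\mathbf{MD})\,\mathbf{v} = \mathbf{MD}\mathbf{v} = \mathbf{B}\mathbf{K}\boldsymbol{\Lambda}^{(m)}\mathbf{K}^\top \mathbf{B}^{-1}\mathbf{D}\mathbf{v}$; (46)

2. $\rho(\mathbf{MD})\,\mathbf{x} = \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}^{(m)}\mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x}$. (47)

Multiplication on the left by $\mathbf{B}\mathbf{D}^{-1/2}$ in (47), and substitution
of (46) reveals the right Perron vector of $\mathbf{MD}$:

\begin{align*}
\rho(\mathbf{MD})\,(\mathbf{B}\mathbf{D}^{-1/2})\mathbf{x} &= (\mathbf{B}\mathbf{D}^{-1/2})\mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}^{(m)}\mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x} \\
&= \mathbf{B}\mathbf{K}\boldsymbol{\Lambda}^{(m)}\mathbf{K}^\top (\mathbf{B}^{-1}\mathbf{D}\mathbf{B}\mathbf{D}^{-1})\mathbf{D}^{1/2}\mathbf{x} \\
&= (\mathbf{B}\mathbf{K}\boldsymbol{\Lambda}^{(m)}\mathbf{K}^\top \mathbf{B}^{-1}\mathbf{D})(\mathbf{B}\mathbf{D}^{-1/2}\mathbf{x}) \\
&= \mathbf{MD}(\mathbf{B}\mathbf{D}^{-1/2}\mathbf{x}), \tag{48}
\end{align*}

which shows that $\mathbf{B}\mathbf{D}^{-1/2}\mathbf{x}$ is the right Perron vector of
$\mathbf{MD}$, unique up to scaling, i.e.

\[
\mathbf{v} = \mathbf{B}\mathbf{D}^{-1/2}\mathbf{x} = c\, \mathbf{D}_{\boldsymbol{\pi}}^{1/2}\mathbf{D}^{-1/2}\mathbf{x},
\]

for some $c$ to be solved. This almost finishes the solution
of $\mathbf{x}$, giving

\[
\mathbf{x} = \frac{1}{c}\, \mathbf{D}_{\boldsymbol{\pi}}^{-1/2}\mathbf{D}^{1/2}\mathbf{v}. \tag{49}
\]

The constraint $\mathbf{x}^\top \mathbf{x} = 1$ gives

\[
1 = \mathbf{x}^\top \mathbf{x} = \frac{1}{c^2}\,(\mathbf{v}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1}\mathbf{D}\mathbf{v})
\implies c = (\mathbf{v}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1}\mathbf{D}\mathbf{v})^{1/2}. \tag{50}
\]

Substitution for $\mathbf{x}$ now produces the expression in the theorem,

\begin{align*}
\mathbf{y} := \mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x} &= \mathbf{K}^\top \mathbf{D}^{1/2}\, \frac{1}{c}\, \mathbf{D}_{\boldsymbol{\pi}}^{-1/2}\mathbf{D}^{1/2}\mathbf{v} \\
&= (\mathbf{v}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1}\mathbf{D}\mathbf{v})^{-1/2}\, \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2}\mathbf{D}\mathbf{v}. \tag{51}
\end{align*}

By the same method as (48), $\mathbf{u}(\mathbf{M}(m)\mathbf{D})$ is derived,
using multiplication on the right by $\mathbf{D}^{1/2}\mathbf{B}^{-1}$ to reveal the
left Perron vector of $\mathbf{MD}$:

\begin{align*}
\rho(\mathbf{MD})\,\mathbf{x}^\top (\mathbf{D}^{1/2}\mathbf{B}^{-1}) &= \mathbf{x}^\top \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}^{(m)}\mathbf{K}^\top \mathbf{D}^{1/2}(\mathbf{D}^{1/2}\mathbf{B}^{-1}) \\
&= \mathbf{x}^\top (\mathbf{D}^{1/2}\mathbf{B}^{-1})\mathbf{MD} = c^*\, \mathbf{u}(\mathbf{MD})^\top
\end{align*}

for some $c^* > 0$. From $\mathbf{B} = c\,\mathbf{D}_{\boldsymbol{\pi}}^{1/2}$ (39), we get
$(1/c^*)\,\mathbf{x}^\top \mathbf{D}^{1/2}\mathbf{D}_{\boldsymbol{\pi}}^{-1/2} = \mathbf{u}(\mathbf{MD})^\top$. Substituting (49) and
noting $\mathbf{u}^\top \mathbf{v} = 1$, we see the simple relationship to
$\mathbf{v}(\mathbf{M}(m)\mathbf{D})$:

\[
\mathbf{u}(\mathbf{M}(m)\mathbf{D}) = \frac{1}{\mathbf{v}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1}\mathbf{D}\mathbf{v}}\, \mathbf{D}_{\boldsymbol{\pi}}^{-1}\mathbf{D}\, \mathbf{v}(\mathbf{M}(m)\mathbf{D}).
\]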


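One additional property that stems from the symmetrizability
of $\mathbf{M}(m)$ in (26) is that $\rho(\mathbf{M}(m)\mathbf{D})$ is convex in $m$.

Theorem 24 (Convexity of $\rho(\mathbf{M}(m)\mathbf{D})$ in $m$).

Let $\mathbf{P}$ and $\mathbf{Q} \in \mathbb{R}^{n,n}$ be transition matrices of ergodic
reversible Markov chains that commute with each other.
Let $\mathbf{D}$ be a positive diagonal matrix and

\[
\mathbf{M}(m) := \mathbf{P}[(1 - m)\mathbf{I} + m\mathbf{Q}].
\]

Then $\rho(\mathbf{M}(m)\mathbf{D})$ is convex in $m$.

Proof. This follows the same lines as in Karlin (1982, Theorem
F.1, p. 199). $\rho(\mathbf{M}(m)\mathbf{D}) = \rho(\mathbf{S})$ in (41), and $\mathbf{S} =
(1 - m)\mathbf{A} + m\mathbf{B}$, where $\mathbf{A} = \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}_P\mathbf{K}^\top \mathbf{D}^{1/2}$ and $\mathbf{B} =
\mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}_P\boldsymbol{\Lambda}_Q\mathbf{K}^\top \mathbf{D}^{1/2}$. The convexity of $\rho((1 - m)\mathbf{A} + m\mathbf{B})$
is established by Lemma 25 to follow. ■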

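Lemma 25 (Convexity of the Spectral Radius). Let $\mathbf{A}$
and $\mathbf{B}$ be two symmetric matrices with unique eigenvectors
$\mathbf{x}_A$ and $\mathbf{x}_B$ associated with their largest eigenvalue,
normalized so $\mathbf{x}_A^\top \mathbf{x}_A = \mathbf{x}_B^\top \mathbf{x}_B = 1$.

Then $\rho((1 - m)\mathbf{A} + m\mathbf{B})$ is convex in $m$, and strictly
convex if $\mathbf{x}_A \neq \mathbf{x}_B$.

Proof. By hypothesis $\mathbf{x}_A$ uniquely yields the maximum in
(42), and likewise $\mathbf{x}_B$ for $\mathbf{B}$, and $\mathbf{x}_m$ for $(1 - m)\mathbf{A} + m\mathbf{B}$.
Therefore,

\begin{align*}
\rho((1 - m)\mathbf{A} + m\mathbf{B}) &= \mathbf{x}_m^\top ((1 - m)\mathbf{A} + m\mathbf{B})\mathbf{x}_m \\
&= (1 - m)\, \mathbf{x}_m^\top \mathbf{A}\mathbf{x}_m + m\, \mathbf{x}_m^\top \mathbf{B}\mathbf{x}_m \\
&\leq (1 - m)\, \mathbf{x}_A^\top \mathbf{A}\mathbf{x}_A + m\, \mathbf{x}_B^\top \mathbf{B}\mathbf{x}_B = (1 - m)\rho(\mathbf{A}) + m\rho(\mathbf{B}).
\end{align*}

Equality requires $\mathbf{x}_A = \mathbf{x}_m = \mathbf{x}_B$, because otherwise,
$\mathbf{x}_A^\top \mathbf{A}\mathbf{x}_A > \mathbf{x}_m^\top \mathbf{A}\mathbf{x}_m$, or $\mathbf{x}_B^\top \mathbf{B}\mathbf{x}_B > \mathbf{x}_m^\top \mathbf{B}\mathbf{x}_m$, either of
which produces strict inequality. ■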
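
The convexity in Lemma 25 is easy to observe for symmetric 2x2 matrices, where the largest eigenvalue has a closed form. The following minimal sketch uses two hypothetical positive definite matrices `A` and `B` (so that the largest eigenvalue is also the spectral radius) and measures the gap between the chord $(1-m)\rho(\mathbf{A}) + m\rho(\mathbf{B})$ and $\rho((1-m)\mathbf{A} + m\mathbf{B})$ on a grid of $m$ values.

```python
import math

def lam_max(S):
    """Largest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    return 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)

def combine(A, B, m):
    """Convex combination (1 - m) A + m B."""
    return [[(1.0 - m) * A[i][j] + m * B[i][j] for j in range(2)] for i in range(2)]

# Hypothetical symmetric positive definite matrices with different top eigenvectors.
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, -0.3], [-0.3, 3.0]]

gaps = []
for k in range(11):
    m = k / 10.0
    chord = (1.0 - m) * lam_max(A) + m * lam_max(B)
    gaps.append(chord - lam_max(combine(A, B, m)))

print(gaps)
```

Every gap is nonnegative, and the gaps at interior values of $m$ are strictly positive because the two matrices have different leading eigenvectors.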

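7.1 Proof of Theorem 16

Theorem 23 is now applied to the derivative of the spectral
radius. The general relation is

\[
\frac{d\rho(\mathbf{A})}{dm} = \mathbf{u}(\mathbf{A})^\top \frac{d\mathbf{A}}{dm}\, \mathbf{v}(\mathbf{A}) \tag{52}
\]

(Caswell 2000, Sec. 9.1.1).

This is derived for the specific case here by differentiating
$\mathbf{S}\mathbf{x} = \rho(\mathbf{MD})\mathbf{x}$ (recall $\mathbf{S}$ from (41)), and then multiplying
on the left by $\mathbf{x}^\top$. Set $\rho = \rho(\mathbf{MD})$. Then

\begin{align*}
\frac{d(\mathbf{S}\mathbf{x})}{dm} = \frac{d\mathbf{S}}{dm}\mathbf{x} + \mathbf{S}\frac{d\mathbf{x}}{dm}
 &= \frac{d\rho}{dm}\mathbf{x} + \rho\frac{d\mathbf{x}}{dm}, \\
\mathbf{x}^\top \frac{d\mathbf{S}}{dm}\mathbf{x} + \mathbf{x}^\top \mathbf{S}\frac{d\mathbf{x}}{dm}
 = \mathbf{x}^\top \frac{d\mathbf{S}}{dm}\mathbf{x} + \rho\, \mathbf{x}^\top \frac{d\mathbf{x}}{dm}
 &= \frac{d\rho}{dm}\mathbf{x}^\top \mathbf{x} + \rho\, \mathbf{x}^\top \frac{d\mathbf{x}}{dm}
 = \frac{d\rho}{dm} + \rho\, \mathbf{x}^\top \frac{d\mathbf{x}}{dm}.
\end{align*}

Subtraction of $\rho(\mathbf{MD})\, \mathbf{x}^\top \frac{d\mathbf{x}}{dm}$ from both sides and substituting
with (44) leaves:

\begin{align*}
\frac{d\rho(\mathbf{MD})}{dm} = \mathbf{x}^\top \frac{d\mathbf{S}}{dm}\mathbf{x}
 &= \mathbf{x}^\top \frac{d}{dm}\left[ \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}_P[(1 - m)\mathbf{I} + m\boldsymbol{\Lambda}_Q]\mathbf{K}^\top \mathbf{D}^{1/2} \right]\mathbf{x} \\
 &= \mathbf{x}^\top \mathbf{D}^{1/2}\mathbf{K}\boldsymbol{\Lambda}_P[\boldsymbol{\Lambda}_Q - \mathbf{I}]\mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x}.
\end{align*}

Substitution with $\mathbf{y} := \mathbf{K}^\top \mathbf{D}^{1/2}\mathbf{x}$ yields the derivative of
(40):

\[
\frac{d\rho(\mathbf{MD})}{dm} = \mathbf{y}^\top \boldsymbol{\Lambda}_P(\boldsymbol{\Lambda}_Q - \mathbf{I})\mathbf{y}
= \sum_{i=1}^{n} \lambda_{Pi}(\lambda_{Qi} - 1)\, y_i^2. \tag{53}
\]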


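Remark. Were $\mathbf{P}$ and $\mathbf{Q}$ not symmetrizable, but only diagonalizable
and commuting, the analysis would arrive at
an expression similar to (53) except that the nonnegative
$y_i^2$ terms would be replaced by products whose signs we
do not know, preventing further evaluation.

We know several things about the terms in the sum in
(53):

1. Since $\mathbf{P}$ and $\mathbf{Q}$ are stochastic matrices, their Perron
roots are 1, which here are labelled as $\lambda_{P1} = \lambda_{Q1} = 1$.

2. $\lambda_{Q1} - 1 = 0$. Thus the first term of the sum is zero.

3. $\lambda_{Qi} - 1 < 0$, for $i \in \{2, \ldots, n\}$, hence $(\lambda_{Qi} - 1)y_i^2 \leq 0$.

Since $\mathbf{P}$ and $\mathbf{Q}$ are symmetrizable, $\lambda_{Pi}, \lambda_{Qi} \in \mathbb{R}$.
Since $\mathbf{P}$ and $\mathbf{Q}$ are irreducible, by Perron-Frobenius
theory (Seneta 2006, Theorems 1.1, 1.5), eigenvalue
1 has multiplicity 1, and $|\lambda_{Qi}| \leq 1$, which together
imply $\lambda_{Qi} < 1$ for $i \in \{2, \ldots, n\}$.

4. $y_i \neq 0$ for at least one $i \in \{2, \ldots, n\}$, whenever $\mathbf{D} \neq
c\,\mathbf{I}$ for any $c > 0$. This fact will take a bit of work
to show: Suppose to the contrary that $y_i = 0$ for all
$i \in \{2, \ldots, n\}$. That means $\mathbf{y} = y_1 \mathbf{e}_1$. Using $c$ from
(50), (51) becomes

\[
y_1 \mathbf{e}_1 = c^{-1}\, \mathbf{K}^\top \mathbf{D}_{\boldsymbol{\pi}}^{-1/2}\mathbf{D}\mathbf{v}.
\]

Multiplication on the left with $\mathbf{D}_{\boldsymbol{\pi}}^{1/2}\mathbf{K}$, and substitution
with $[\mathbf{K}]_1 = \boldsymbol{\pi}^{1/2}$ (38) yields

\[
y_1 \mathbf{D}_{\boldsymbol{\pi}}^{1/2}\mathbf{K}\mathbf{e}_1 = y_1 \mathbf{D}_{\boldsymbol{\pi}}^{1/2}[\mathbf{K}]_1 = y_1 \boldsymbol{\pi} = c^{-1}\, \mathbf{D}\mathbf{v}.
\]

Multiplication of $y_1 c\, \boldsymbol{\pi} = \mathbf{D}\mathbf{v}$ by $\mathbf{M}$ gives

\[
\mathbf{M}\boldsymbol{\pi}\, y_1 c = y_1 c\, \boldsymbol{\pi} = \mathbf{D}\mathbf{v} = \mathbf{MD}\mathbf{v} = \rho(\mathbf{MD})\, \mathbf{v}.
\]

Hence, $\mathbf{D}\mathbf{v} = \rho(\mathbf{MD})\, \mathbf{v}$, implying $\mathbf{D} = \rho(\mathbf{MD})\, \mathbf{I}$, contrary
to hypothesis. Therefore, $\mathbf{D} \neq c\,\mathbf{I}$ for any $c > 0$
implies that $y_i \neq 0$ for at least one $i \in \{2, \ldots, n\}$.

Points 3 and 4 above together imply that $(\lambda_{Qi} - 1)y_i^2 < 0$
for at least one $i \in \{2, \ldots, n\}$. Inclusion of point [5]
immediately implies for (53) that

1. If $\lambda_{Pi} > 0$ for all $i$, then $\frac{d}{dm}\rho(\mathbf{MD}) < 0$.

2. If $\lambda_{Pi} < 0$ for $i = 2, \ldots, n$, then $\frac{d}{dm}\rho(\mathbf{MD}) > 0$.

3. Otherwise: there may be positive, negative, or zero
terms $\lambda_{Pi}(\lambda_{Qi} - 1)y_i^2$, so the sign of $\frac{d}{dm}\rho(\mathbf{MD})$ depends
on the particular values of the terms, of which
we know little at this point.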

Remark. Condition $\lambda_{Pi} > 0$ for all $i$ is equivalent to $\mathbf{P}$
being symmetrizable to a positive definite matrix, which is
the hypothesized condition in Karlin's Theorem 5.1. The
condition $\lambda_{Pi} < 0$ for $i \in \{2, \ldots, n\}$ in case 2 happens
to be the same as the well-known condition on the fitness
matrix for a stable multiple-allele polymorphism (Kingman
1961). Here this appears to be coincidence, rather
than a clue to some deeper result. However, that condition
is central to the 'viability-analogous' modifier polymorphisms,
where the matrix $[1 - m_{ij}]_{i,j=1}^{n}$ (from diploid
modifier genotypes $i|j$), must have all negative eigenvalues
except the Perron root to assure stability of the modifier



polymorphism ( 


Feldman and Liberman 


and Feldman 


1986a b 


1989 


I, supportin 



tween viability coefficients and modifier values 1 



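The Harmonic Mean. The following inequalities are
equivalent:

$$ E_H(\tau) := \frac{1}{\frac{1}{n}\sum_{i=1}^{n} \frac{1}{\tau_i}} < 1 + \frac{1}{n-1} = \frac{n}{n-1} \qquad (54) $$

$$ n - 1 < \sum_{i=1}^{n} \frac{1}{\tau_i} = n - \sum_{i=1}^{n} P_{ii} $$

$$ 1 > \sum_{i=1}^{n} P_{ii} = \sum_{i=1}^{n} \lambda_i(P) = 1 + \sum_{i=2}^{n} \lambda_i(P) $$

$$ 0 > \sum_{i=2}^{n} \lambda_i(P). \qquad (55) $$

Hence, $\lambda_i(P) < 0$ for $i = 2, \ldots, n$ implies (55), or equivalently,
(54). Conversely, $\lambda_i(P) > 0$ for $i = 2, \ldots, n$ implies
$\sum_{i=2}^{n} \lambda_i(P) > 0$, hence the reverse inequality in (54). ■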
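The equivalence of (54) and (55) can be checked numerically. The following is a minimal sketch in plain Python, assuming that the expected duration of state $i$ is $\tau_i = 1/(1 - P_{ii})$ (the reading implied by the step $\sum_i 1/\tau_i = n - \sum_i P_{ii}$) and using the fact that the non-Perron eigenvalues of a stochastic matrix sum to $\mathrm{tr}(P) - 1$. The function names and the two example matrices are illustrative and do not come from the paper.

```python
def harmonic_mean_duration(P):
    """Harmonic mean of expected durations tau_i = 1 / (1 - P[i][i])."""
    n = len(P)
    # 1 / tau_i = 1 - P_ii, so the reciprocals sum to n - trace(P).
    return n / sum(1.0 - P[i][i] for i in range(n))


def non_perron_eigenvalue_sum(P):
    """Sum of lambda_2..lambda_n: trace(P) minus the Perron eigenvalue 1."""
    return sum(P[i][i] for i in range(len(P))) - 1.0


def satisfies_54(P):
    """Harmonic mean of durations below 1 + 1/(n - 1)."""
    return harmonic_mean_duration(P) < 1.0 + 1.0 / (len(P) - 1)


def satisfies_55(P):
    """Non-Perron eigenvalues sum to a negative number."""
    return non_perron_eigenvalue_sum(P) < 0.0


# Hypothetical column-stochastic examples (each column sums to 1).
fast_change = [[0.1, 0.5, 0.4],
               [0.5, 0.1, 0.4],
               [0.4, 0.4, 0.2]]
slow_change = [[0.8, 0.1, 0.1],
               [0.1, 0.8, 0.1],
               [0.1, 0.1, 0.8]]
```

For any stochastic matrix the two predicates agree, because both reduce to the comparison of $\mathrm{tr}(P)$ with 1.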
7.2 Proof of Theorem 20: 'House of Cards' Environmental Change

By hypothesis, the probability that the environment remains
unchanged in one generation is $\sigma = 1 - 1/\tau$. If the current
environment has no influence on which environment comes
next (Kingman's 'House of Cards' model, Kingman 1978,
1980) then

$$ P_{ij} = (1 - \sigma)\pi_i + \sigma\delta_{ij}, $$

where $\pi_i$ is the probability that any changed environment
becomes $i$, giving

$$ P = (1 - \sigma)\pi e^\top + \sigma I. $$

Since $\pi e^\top$ is a rank-one matrix, $\lambda_i(\pi e^\top) = 0$ for
$i = 2, \ldots, n$ (Horn and Johnson 1985, p. 62). Hence
$\lambda_i(P) = \sigma$ for $i = 2, \ldots, n$.

For the case $\tau = 1$, then $\lambda_i(P) = \sigma = 0$, $i = 2, \ldots, n$, so
(53) evaluates to

$$ \frac{d\rho(MD)}{dm} = \sum_{i=1}^{n} \lambda_{Pi}(\lambda_{Qi} - 1) y_i^2 = 1(1 - 1) y_1^2 + \sum_{i=2}^{n} 0\,(\lambda_{Qi} - 1) y_i^2 = 0. $$

For the case $\tau > 1$, then $\lambda_i(P) = \sigma > 0$, $i = 2, \ldots, n$, so
(53) evaluates to

$$ \frac{d\rho(MD)}{dm} = 1(1 - 1) y_1^2 + \sum_{i=2}^{n} \sigma(\lambda_{Qi} - 1) y_i^2 < 0, $$

since $D \neq cI$ for any $c \in \mathbb{R}$ implies that $\lambda_{Qi} - 1 < 0$
for $i \in \{2, \ldots, n\}$, and by point 4 in the proof of Theorem 16,
$y_i \neq 0$ for some $i \in \{2, \ldots, n\}$. ■
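The two cases of this proof can be illustrated numerically. The sketch below, in plain Python, builds $P = (1 - \sigma)\pi e^\top + \sigma I$ and assumes the unconditional dispersal form $M = P[(1 - m)I + m\pi e^\top]$ (the conditional form of Section 7.3 with $C = mI$), which equals $(1 - m)P + m\pi e^\top$ because $P\pi = \pi$. The values of $\pi$ and $D$ are hypothetical, and the spectral radius is found by simple power iteration, which is adequate here only because every matrix involved is entrywise positive.

```python
def spectral_radius(A, iters=5000):
    """Perron root of an entrywise-positive matrix, by power iteration."""
    n = len(A)
    x = [1.0 / n] * n
    rho = 0.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        rho = sum(y)              # x sums to 1, so sum(A x) estimates the root
        x = [yi / rho for yi in y]
    return rho


def house_of_cards(pi, sigma):
    """P = (1 - sigma) * pi e^T + sigma * I, a column-stochastic matrix."""
    n = len(pi)
    return [[(1 - sigma) * pi[i] + (sigma if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def growth_rate(pi, sigma, D, m):
    """rho(M D) with M = (1 - m) P + m pi e^T (assumed dispersal form)."""
    n = len(pi)
    P = house_of_cards(pi, sigma)
    MD = [[((1 - m) * P[i][j] + m * pi[i]) * D[j] for j in range(n)]
          for i in range(n)]
    return spectral_radius(MD)


pi = [0.5, 0.3, 0.2]      # hypothetical stationary environment frequencies
D = [2.0, 1.0, 0.5]       # hypothetical growth rates, not all equal
ms = [0.0, 0.25, 0.5, 0.75, 1.0]
rates_tau2 = [growth_rate(pi, 0.5, D, m) for m in ms]   # tau = 2, sigma = 0.5
rates_tau1 = [growth_rate(pi, 0.0, D, m) for m in ms]   # tau = 1, sigma = 0
```

Under these assumptions `rates_tau2` should fall as `m` rises, while `rates_tau1` should stay constant at $e^\top D \pi$, matching the cases $\tau > 1$ and $\tau = 1$.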



7.3 Proof of Theorem 21: Conditional Dispersal

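1. Basic identities used are $dC/dC_\kappa = D_{e_\kappa}$, $e^\top e_\kappa = 1$,
and $P e_\kappa = [P]_\kappa$. The derivative formula (52),
$d\rho(A)/d\beta = u(A)^\top (\partial A/\partial\beta)\, v(A)$ (Caswell 2000,
Sec. 9.1.1), with respect to parameter $\beta$, is applied
to yield

$$
\begin{aligned}
\frac{\partial}{\partial C_\kappa}\rho(MD)
 &= \frac{\partial}{\partial C_\kappa}\rho\big(P[(I - C) + \pi e^\top C]D\big) \\
 &= u^\top \frac{\partial M}{\partial C_\kappa} D v
  = u^\top P\big({-D_{e_\kappa}} + \pi e^\top D_{e_\kappa}\big) D v \\
 &= u^\top (P\pi e^\top e_\kappa - P e_\kappa)\, D_\kappa v_\kappa \qquad (56) \\
 &= u^\top (\pi - [P]_\kappa)\, D_\kappa v_\kappa
  = D_\kappa v_\kappa \sum_i u_i (\pi_i - P_{i\kappa}) \\
 &= D_\kappa v_\kappa\, n\, [\mathrm{Cov}(u_i, \pi_i) - \mathrm{Cov}(u_i, P_{i\kappa})].
\end{aligned}
$$

Note: the $1/n^2$ terms in the covariances cancel in (57),
because $\sum_i \pi_i = \sum_i P_{i\kappa} = 1$, e.g.

$$ \mathrm{Cov}(u_i, P_{i\kappa}) = \frac{1}{n}\sum_i u_i P_{i\kappa} - \frac{1}{n^2}\sum_i u_i \sum_i P_{i\kappa} = \frac{1}{n}\sum_i u_i P_{i\kappa} - \frac{1}{n^2}\sum_i u_i. $$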



2. There is always at least one environment in which increased
dispersal is advantageous, and at least one
environment where decreased dispersal is advantageous:
If no environment selects for increased dispersal,
that means $\partial\rho(MD)/\partial C_\kappa \leq 0$ for all $\kappa$, hence
$u^\top(\pi - [P]_\kappa) \leq 0$ since $D_\kappa v_\kappa > 0$, or, combined,
$u^\top(\pi e^\top - P) \leq 0^\top$. Then $u^\top(\pi e^\top - P)\pi \leq 0$. But

$$ u^\top(\pi e^\top - P)\pi = u^\top\pi - u^\top\pi = 0, $$

so $u^\top(\pi - [P]_\kappa) = 0$ for all $\kappa$, which implies either
$\pi = [P]_\kappa\ \forall\kappa$ since $u > 0$, or $u^\top = e^\top$ which requires
$D = cI$ for some $c \in \mathbb{R}$. If neither $\pi = [P]_\kappa\ \forall\kappa$ nor
$D = cI$, then there must be some $\kappa$ for which
$u^\top(\pi - [P]_\kappa) > 0$, hence
$\frac{\partial}{\partial C_\kappa}\rho\big(P[(I - C) + \pi e^\top C]D\big) > 0$.

The parallel argument follows when $\leq$ is replaced by
$\geq$ above.

Remark. When $n = 2$, then it must be the case that
the spectral radius is maximized at either $(C_1, C_2) =
(1, 0)$, or at $(C_1, C_2) = (0, 1)$. This is illustrated in
the numerical example in Figure 1.

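3. A row vector is made from (56), over $\kappa$:

$$
\begin{aligned}
\nabla_C\, \rho(MD) &:= \left[\frac{\partial\rho(MD)}{\partial C_\kappa}\right]_{\kappa=1}^{n}
 = \left[u^\top(P\pi e^\top - P)\, e_\kappa D_\kappa v_\kappa\right]_{\kappa=1}^{n} \qquad (57) \\
 &= u^\top(\pi v^\top - P D_v)\, D.
\end{aligned}
$$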



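4. Since $\mathcal{N} := \{\xi : \nabla_C\, \rho(MD)\, \xi = 0\}$ is defined by a
single constraint, it is an $n - 1$ dimensional linear subspace.
Verification is given that $C = D_v^{-1} D^{-1}\pi \in \mathcal{N}$:

$$
\begin{aligned}
\nabla_C\, \rho(MD)\, C &= u^\top(\pi v^\top - P D_v)\, D\, (D_v^{-1} D^{-1}\pi) \\
 &= u^\top(\pi v^\top D D_v^{-1} D^{-1}\pi - P\pi) = u^\top(\pi v^\top D_v^{-1}\pi - \pi) \\
 &= u^\top(\pi e^\top\pi - \pi) = u^\top(\pi - \pi) = 0.
\end{aligned}
$$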
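Steps 1, 2, and 4 of this proof lend themselves to a numerical check. The sketch below, in plain Python, compares the analytic partial derivatives $u^\top(\pi - [P]_\kappa) D_\kappa v_\kappa$ against central finite differences of $\rho(MD)$, and then tests that $C = D_v^{-1}D^{-1}\pi$ is annihilated by the gradient. It assumes the eigenvectors are scaled so that $u^\top v = 1$, which is the normalization under which the derivative formula holds without a denominator. The matrix $P$, the growth rates $D$, and the point $C$ are hypothetical.

```python
def perron(A, iters=3000):
    """Return (rho, x) with A x = rho x, x > 0, sum(x) = 1 (power iteration)."""
    n = len(A)
    x = [1.0 / n] * n
    rho = 0.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        rho = sum(y)
        x = [yi / rho for yi in y]
    return rho, x


def transpose(A):
    return [list(col) for col in zip(*A)]


def conditional_MD(P, pi, C, D):
    """M D with M = P[(I - C) + pi e^T C]; because P pi = pi, column k of M
    is (1 - C_k) [P]_k + C_k pi."""
    n = len(pi)
    return [[((1 - C[k]) * P[i][k] + C[k] * pi[i]) * D[k] for k in range(n)]
            for i in range(n)]


def analytic_gradient(P, pi, C, D):
    """Entries u^T (pi - [P]_k) D_k v_k, with u scaled so that u^T v = 1."""
    n = len(pi)
    A = conditional_MD(P, pi, C, D)
    _, v = perron(A)
    _, u = perron(transpose(A))          # left Perron vector of M D
    scale = sum(ui * vi for ui, vi in zip(u, v))
    u = [ui / scale for ui in u]
    grad = [sum(u[i] * (pi[i] - P[i][k]) for i in range(n)) * D[k] * v[k]
            for k in range(n)]
    return grad, v


def numeric_partial(P, pi, C, D, k, h=1e-6):
    """Central finite difference of rho(M D) with respect to C_k."""
    up, down = list(C), list(C)
    up[k] += h
    down[k] -= h
    return (perron(conditional_MD(P, pi, up, D))[0]
            - perron(conditional_MD(P, pi, down, D))[0]) / (2 * h)


# Hypothetical 3-environment example; columns of P sum to 1.
P = [[0.6, 0.2, 0.1],
     [0.3, 0.5, 0.2],
     [0.1, 0.3, 0.7]]
_, pi = perron(P)                 # stationary distribution, P pi = pi
D = [2.0, 1.0, 0.5]
C = [0.2, 0.5, 0.3]

grad, v = analytic_gradient(P, pi, C, D)
numeric = [numeric_partial(P, pi, C, D, k) for k in range(3)]
# Candidate null vector from step 4: C* = D_v^{-1} D^{-1} pi.
c_star = [pi[k] / (v[k] * D[k]) for k in range(3)]
```

Because the entries of `c_star` are all positive and `grad` is orthogonal to it, the gradient must contain entries of both signs whenever it is nonzero, which is the content of step 2.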



8 Discussion 

There are two sets of take-home messages from the results 
here: one, content, and the other, methodology. Some of 
the content can be summarized simply as, "the results of 
McNamara and Dall (2011) generalize to n environments,"
but in that generalization, new relationships emerge that
were not visible from the parameters with only 2
environments.

The content in general may be understood to be part 
of the larger body of literature on the Reduction Principle 
and the departures from it, which include domains reach- 
ing from the evolution of recombination to the evolution of 
cultural traditionalism. The environmental change model 
of McNamara and Dall produces some specific new results
for the reduction phenomena. 

We see first that the "build up of the genotype on good 
sites" can be defined precisely as the fitness-abundance 
covariance — the covariance between the environment- 
specific growth rate and the excess abundance above what 
would emerge without differential growth. The phase of 
the census — whether taken before or after dispersal — is 
critical to the properties of the fitness-abundance covari- 
ance. 

Exploration of the stationary state fitness-abundance
covariance and its dependence on census phase is made
for general combinations of stochastic M and growth rates
D in Theorem 3 and Corollaries 4, 11, and 12. The only
constraints are that M be irreducible and growth rates
be positive. Thus, they could just as well apply to
mutation/selection balances as to dispersal/growth balances.
Results for specific classes of M are found in Theorems 5,
6, and 7 and Corollary 13.

Theorem 3 shows that the process of dispersal decreases
the fitness-abundance covariance by the variance
in growth rates, a version of Fisher's Fundamental Theorem.
Corollary 4 shows that the derivatives of the
post-dispersal $\mathcal{FA}(MD)$ and of $\rho(MD)$ always have the same
sign with respect to any differentiation of M. Theorem 5
finds that the pre-dispersal fitness-abundance covariance,
$\mathcal{FA}(v(DM))$, is always positive when M is the transition
matrix of an ergodic reversible Markov chain with all
nonnegative eigenvalues, and growth rates differ between
environments. Reversibility is important here, because a
counterexample with $\mathcal{FA}(v(DM)) < 0$ is found for periodic
chains where one environment has a very small growth
rate (Theorem 7). It is reasonable to conjecture, given the
small region of growth rates in which $\mathcal{FA}(v(DM)) < 0$,
that $\mathcal{FA}(v(DM)) > 0$ for all reversible chains regardless
of the signs of their eigenvalues. Cyclic M, on the other
hand, always produce a negative post-dispersal
fitness-abundance covariance, $\mathcal{FA}(MD)$ (Theorem 6).

When the growth rate of an environment is increased,
then its stationary proportion of the population increases
when the census is just prior to dispersal (Corollary 11).
When the population is censused just after dispersal,
the relationship can be reversed, as McNamara and Dall
(2011) discovered, by extreme patterns of environmental
change. Thus, we have a novel implication, for populations
near their stationary distribution, that comparison
of the abundance relationships before and after dispersal
can provide information about the extremity of the
environmental change pattern (Corollary 13).



A number of results are obtained for a generalization
of the McNamara and Dall (2011) model to n environments
(7). Corollary 14 finds, just as McNamara and Dall
do for two environments, that the post-dispersal
fitness-abundance covariance, $\mathcal{FA}(MD)$ (which McNamara and
Dall call the "multiplier effect"), is positive exactly when
the reduction principle operates, i.e. when the growth
rate of the population increases from reduced unconditional
dispersal. It is negative when there are departures
from reduction. This correspondence between a negative
fitness-abundance covariance and departures from the
reduction phenomenon is, however, specific to the model of
McNamara and Dall and not a general property of departures
from reduction for operators of the form MD.

The field ecologist would want to know how feasible it is
to measure the fitness-abundance covariance. Recall that
M can represent a variety of processes. When M represents
the dispersal probabilities between patches, then $v_i$
represents the portion of the population in patch i, and
the quantity $\pi_i$ represents the portion that patch i would
have in the absence of differential growth rates. Thus $\pi_i$ is
not something that actually exists but is a counterfactual.
It may be feasible, however, to estimate $\pi$ by estimating
M from a measurement of the amount of dispersal between
each pair of patches (e.g. through mark and recapture
experiments), and computing the Perron vector of the
resulting estimated M.
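The estimation route just described can be sketched in a few lines of plain Python. The layout of the count table (entry `counts[i][j]` is the number of marked individuals released in patch j and later recaptured in patch i), the numbers in it, and the function names are all hypothetical; the sketch also ignores mortality and unequal recapture effort, which a real study would need to model.

```python
def estimate_dispersal_matrix(counts):
    """Column-stochastic estimate of M from a release-by-recapture table.

    counts[i][j]: marked individuals released in patch j, recaptured in
    patch i (an assumed data layout)."""
    n = len(counts)
    col_totals = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    return [[counts[i][j] / col_totals[j] for j in range(n)]
            for i in range(n)]


def perron_vector(M, iters=2000):
    """Stationary distribution pi with M pi = pi, by power iteration.

    Assumes M is column-stochastic with all entries positive."""
    n = len(M)
    x = [1.0 / n] * n
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [yi / total for yi in y]
    return x


# Hypothetical recapture counts for three patches.
counts = [[80, 10, 5],
          [15, 70, 15],
          [5, 20, 80]]

M_hat = estimate_dispersal_matrix(counts)
pi_hat = perron_vector(M_hat)   # counterfactual abundances without growth differences
```

The resulting `pi_hat` could then be compared with the observed patch proportions $v_i$ to gauge the excess abundance that enters the fitness-abundance covariance.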

M has a different meaning in the model of McNamara
and Dall, where it represents the Markov chain that the
environmental states independently follow in all the patches,
and $\pi_i$ is simply the portion of patches that are in
environmental state i, while $v_i$ is the portion of the population
in patches of environmental state i. Each of these is an
actual quantity that is potentially measurable.

The expression $\tau_1^{-1} + \tau_2^{-1}$ from McNamara and Dall
(2011) is seen in the general case to be a part of the
harmonic mean of the expected durations of states in a
Markov chain. The harmonic mean is shown in Lemma
15 to be a simple function of the sum of the eigenvalues
of the chain's transition matrix. Thus the condition on
this expression discovered by McNamara and Dall is really a
condition on the eigenvalues of the environment transition
matrix.

In Theorem 16 these three entities (the reduction
phenomenon, the harmonic mean of environment durations,
and the eigenvalues of the environment transition
matrix) are tied together in the case of environmental
change processes that are reversible Markov chains. A
sufficient condition for departures from reduction (selection
for increased unconditional dispersal) is that all of
the non-Perron eigenvalues of the environment transition
matrix be negative, which represents an extreme pattern
of change, in which the harmonic mean of environment
durations is less than $1 + 1/(n - 1)$, where n is the number
of environments. This means the environment changes
almost every generation.

This departure from reduction identified by McNamara
and Dall and generalized in Theorem 16 provides a new
example summarized by the "principle of partial control"
(Altenberg 1984). The 'partial control' in Theorem 16 is
that while the organism can control the transformation of
its location (i.e. dispersal), it cannot control the
transformations that change its environment.

Theorem 16 shows that a sufficient condition for the
reduction phenomenon (selection for reduced unconditional
dispersal) is that all eigenvalues of the environment
transition matrix be positive, corresponding to less extreme
environmental change. A general treatment of the
intermediate case, of mixed positive and negative eigenvalues,
remains an open question. But Corollary 19 shows that
there is always some intermediate level of environmental
change below which the reduction principle operates. The
reduction principle would be expected to operate for more
common patterns of environmental change.
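The two sufficient conditions can be seen side by side in a two-environment example. The sketch below, in plain Python, assumes the unconditional dispersal form $M = (1 - m)P + m\pi e^\top$ and uses two hypothetical symmetric transition matrices, one with non-Perron eigenvalue $+0.8$ and one with $-0.8$; the growth rates are also hypothetical. Power iteration is used for the spectral radius, which suffices because all matrices here are entrywise positive.

```python
def spectral_radius(A, iters=5000):
    """Perron root of an entrywise-positive matrix, by power iteration."""
    n = len(A)
    x = [1.0 / n] * n
    rho = 0.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        rho = sum(y)
        x = [yi / rho for yi in y]
    return rho


def growth_rate(P, pi, D, m):
    """rho(M D) with M = (1 - m) P + m pi e^T (assumed dispersal form)."""
    n = len(pi)
    MD = [[((1 - m) * P[i][j] + m * pi[i]) * D[j] for j in range(n)]
          for i in range(n)]
    return spectral_radius(MD)


pi = [0.5, 0.5]                        # stationary distribution of both chains
D = [2.0, 1.0]                         # hypothetical growth rates
P_slow = [[0.9, 0.1], [0.1, 0.9]]      # non-Perron eigenvalue +0.8
P_fast = [[0.1, 0.9], [0.9, 0.1]]      # non-Perron eigenvalue -0.8
ms = [0.0, 0.25, 0.5, 0.75, 1.0]

slow = [growth_rate(P_slow, pi, D, m) for m in ms]   # reduction principle
fast = [growth_rate(P_fast, pi, D, m) for m in ms]   # departure from reduction
```

With slowly changing environments the growth rate falls as dispersal increases, and with near-alternating environments it rises; both sequences meet at $e^\top D\pi$ when $m = 1$.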



Theorem 20 shows that this complexity of behavior
disappears when the process of environmental change does
not have any causal connection between the identity of
sequential environments. In this case, only the reduction
principle operates.

McNamara and Dall's model of conditional dispersal is
here generalized to arbitrary numbers of environments.
Theorem 21 shows that conditional dispersal provides another
situation where we observe departures from the
reduction principle. Conditional dispersal is mathematically
analogous to directed mutation. Theorem 21 finds that
there is always some environment from which it pays to
increase dispersal, provided that there is: 1) some level of
environmental change, 2) a causal connection between the
current and next environments, and 3) different growth
rates among environments. Therefore, philopatry is not
the global evolutionarily stable state. This result holds
for arbitrary environmental change Markov chains, and
holds whether or not unconditional dispersal follows the
reduction principle.

This seems to contradict the conclusion of McNamara
and Dall (2011) that there are "conditions under which
reliable, cost-free cues to habitat quality, which might
intuitively influence optimal dispersal decisions, should be
ignored in favour of blind natal philopatry." This
contradiction is resolved by examining the complete adaptive
landscape for the conditional dispersal rates (the n = 2
case in Figure 1).
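A coarse version of such a landscape can be computed directly. The following sketch, in plain Python, evaluates $\rho(MD)$ on a grid of conditional dispersal rates $(C_1, C_2)$ for $n = 2$, using $M = P[(I - C) + \pi e^\top C]$ from Section 7.3. The transition matrix, growth rates, and grid resolution are hypothetical and are not the values used for the paper's figure.

```python
def spectral_radius(A, iters=500):
    """Perron root of an entrywise-positive matrix, by power iteration."""
    n = len(A)
    x = [1.0 / n] * n
    rho = 0.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        rho = sum(y)
        x = [yi / rho for yi in y]
    return rho


def growth_rate(P, pi, D, C):
    """rho(M D); column k of M is (1 - C_k) [P]_k + C_k pi since P pi = pi."""
    n = len(pi)
    MD = [[((1 - C[k]) * P[i][k] + C[k] * pi[i]) * D[k] for k in range(n)]
          for i in range(n)]
    return spectral_radius(MD)


P = [[0.9, 0.1], [0.1, 0.9]]   # hypothetical slowly changing environments
pi = [0.5, 0.5]                # stationary distribution of P
D = [2.0, 1.0]                 # environment 1 is the better one

grid = [i / 10 for i in range(11)]
landscape = {(c1, c2): growth_rate(P, pi, D, [c1, c2])
             for c1 in grid for c2 in grid}
best = max(landscape, key=landscape.get)
```

In this example the maximum falls at a corner of the unit square, with no dispersal from the better environment and full dispersal from the worse one, so unconstrained philopatry $(0, 0)$ is not the optimum.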

We see that the evolutionarily stable state of dispersal is 
highly sensitive to any genetic or phenotypic constraints 
placed on the range of dispersal combinations. The
hypothesized error rate for environmental cues in the
McNamara and Dall model can constrain the variation to a
region where the population growth rate is maximized by
philopatry. But a slight decrease in the error rate can shift
the evolutionarily stable state to maximize conditional
dispersal from one environment, as shown by McNamara and
Dall.

Other patterns of phenotypic constraint can be envi- 
sioned, and the sensitivity of the ESS in this model to 
phenotypic constraints leads to a diversity of potential 
phenomena: intermediate ESS states, bimodal states, or 
a general condition of evolutionary volatility. The evolu- 
tionary outcome becomes highly dependent on the variational
properties (Altenberg 1995) of the organism. To
the extent that the McNamara and Dall model of random
environments applies to the real world, the results suggest 
that empirical studies of the evolution of dispersal should 
find volatile relationships between an organism's dispersal 
behavior, the variational properties of its dispersal pheno- 
type, and the pattern of environmental change its lineage 
has experienced. 

8.1 Mathematical Methods 

The second set of take-home messages from this paper re- 
gards the mathematical methods. The primary message 
is that techniques from the Reduction Principle literature 
and contemporary linear algebra allow one to obtain ana- 
lytical results in greater generality than is often pursued. 
The common restriction to 2 x 2 matrices can be dropped 
for many results. 

There is the added benefit from generalizing 2 x 2 models
to the n x n case, which is that one is forced to see
beyond the four particular entries of the 2 x 2 matrices
to their deeper underlying structures, in particular their
eigenvalues and eigenvectors, covariances, and the variational
structure of the matrices. In the case of McNamara
and Dall (2011, online Appendix A, Theorem A), the set
of inequalities on the particular vector elements can be
unified by a single inequality on a covariance expression,
as in Corollary 14. It is hoped that the tractability of
many results for general n, and the insights provided from
such results, will encourage this approach more widely.

Tractability for Theorems 5, 16, 20, and 23 and Corollaries
17, 18, and 19 requires the assumption that the environments
form a reversible Markov chain. The transition
matrices of reversible Markov chains are synonymous
with symmetrizable stochastic matrices. The tractability
provided by symmetrizable stochastic matrices is the key
tool adopted from Karlin's Theorem 5.1 (1982) and Friedland
and Karlin (1975, Theorem 4.1). Karlin's Theorem
5.1 appears to have never been used since its publication
until it was applied to the analysis of the evolution of
mutation rates at multiple loci in Altenberg (2009a). Recently,
however, symmetrizable stochastic matrices have
been used by Schreiber and Li (2011) to analyze the evolution
of dispersal in cyclic environments.

The environmental cycling that produces a departure
from the reduction principle in the model of Schreiber and
Li (2011) satisfies the same condition of extreme environmental
change as in Theorem 16, and the matrices are
symmetrizable as well. But it is fundamentally a different
model in that the environments change synchronously
throughout all the patches, not independently as in the
McNamara and Dall model, so it is represented by
$z(t + 2) = M D^{(2)} M D^{(1)} z(t)$.
Nevertheless, the parallels in its behavior with that of the
McNamara and Dall model are intriguing. Recall that
Karlin (1982) represented periodic environments by using
cyclic matrices, so the model of Schreiber and Li (2011)
can be represented as

$$
\begin{bmatrix} 0 & M(m) \\ M(m) & 0 \end{bmatrix}
\begin{bmatrix} D^{(1)} & 0 \\ 0 & D^{(2)} \end{bmatrix}.
$$

We note that this has the form (1 - m)A + mB Q from
the open problem posed in Altenberg (2004), and therefore
provides another set of conditions on A and B that
produce departures from reduction.
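To make the cyclic-matrix idea concrete, the sketch below, in plain Python, builds a two-season block matrix and checks that two applications of it reproduce the two-step operator $M D^{(2)} M D^{(1)}$ given above. The block layout $[[0, M], [M, 0]]\,\mathrm{diag}(D^{(1)}, D^{(2)})$ is an assumption of this sketch about how such a representation can be written, and the numerical values of M and the seasonal growth rates are hypothetical.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def scale_columns(M, d):
    """M times diag(d)."""
    return [[M[i][j] * d[j] for j in range(len(d))] for i in range(len(M))]


def cyclic_block(M, D1, D2):
    """[[0, M], [M, 0]] diag(D1, D2) = [[0, M D2], [M D1, 0]]."""
    n = len(M)
    zero = [0.0] * n
    MD1, MD2 = scale_columns(M, D1), scale_columns(M, D2)
    top = [zero + MD2[i] for i in range(n)]
    bottom = [MD1[i] + zero for i in range(n)]
    return top + bottom


M = [[0.7, 0.4], [0.3, 0.6]]   # hypothetical column-stochastic dispersal matrix
D1 = [2.0, 0.5]                # hypothetical growth rates in season 1
D2 = [0.5, 2.0]                # hypothetical growth rates in season 2

B = cyclic_block(M, D1, D2)
two_steps = matmul(B, B)       # block diagonal after one full cycle
direct = matmul(scale_columns(M, D2), scale_columns(M, D1))   # M D2 M D1
```

The upper-left block of `two_steps` equals `direct`, and the off-diagonal blocks vanish, so the spectral radius of the block matrix is the square root of that of the two-step operator.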

The case of general Markov chains remains an open 
problem for the above results. The principal difference
when considering general Markov chains is that the non- 
Perron eigenvalues may come in complex-conjugate pairs, 
which represent cycles of states that are more probable in 
one direction than the reverse. Whether directional cycles 
of the environments can produce any new phenomena for 
the evolution of dispersal is here an open question. 
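The contrast between reversible and directionally cycling chains is easy to exhibit for three states. In the sketch below, in plain Python, the two non-Perron eigenvalues of a 3-state stochastic matrix are obtained as the roots of $x^2 - (\mathrm{tr}\,P - 1)x + \det P = 0$, which follows because the Perron eigenvalue is 1. Both example matrices are hypothetical: one is symmetric and hence reversible with respect to the uniform distribution, and the other moves preferentially around the cycle 1 → 2 → 3 → 1.

```python
import cmath


def det3(P):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (P[0][0] * (P[1][1] * P[2][2] - P[1][2] * P[2][1])
            - P[0][1] * (P[1][0] * P[2][2] - P[1][2] * P[2][0])
            + P[0][2] * (P[1][0] * P[2][1] - P[1][1] * P[2][0]))


def non_perron_eigenvalues(P):
    """Roots of x^2 - (tr P - 1) x + det P for a 3-state stochastic P."""
    b = P[0][0] + P[1][1] + P[2][2] - 1.0
    root = cmath.sqrt(b * b - 4.0 * det3(P))
    return (b + root) / 2, (b - root) / 2


def is_reversible(P, pi, tol=1e-12):
    """Detailed balance for column-stochastic P: P_ij pi_j == P_ji pi_i."""
    n = len(pi)
    return all(abs(P[i][j] * pi[j] - P[j][i] * pi[i]) < tol
               for i in range(n) for j in range(n))


uniform = [1 / 3, 1 / 3, 1 / 3]
reversible = [[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]]
directional = [[0.1, 0.1, 0.8],    # column j: stay 0.1, next state 0.8,
               [0.8, 0.1, 0.1],    # previous state 0.1
               [0.1, 0.8, 0.1]]

lam_rev = non_perron_eigenvalues(reversible)
lam_dir = non_perron_eigenvalues(directional)
```

The reversible chain yields two real eigenvalues, whereas the directional chain yields a complex-conjugate pair, which is the situation the results for symmetrizable matrices do not cover.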

8.2 Conclusions 



Andrewartha (1961) classically defined ecology as "the
scientific study of the distribution and abundance of organ-
isms." In this respect, the fitness-abundance covariance 
investigated here is a basic quantity for ecology. 

What makes its behavior more complex than intuition 
would suggest is that differential growth rates between 
patches or environments can interact with the multitude of 
possible dispersal, environmental change, and other mixing
processes to produce novel relationships. The relationships
identified by McNamara and Dall (2011) between
the fitness-abundance covariance, the temporal properties
of environmental change, and selection for or against
dispersal provided the motivation for the present study.

The goal here has been to pursue the mathematics un- 
derlying these relationships. In so doing, these relation- 
ships are shown to connect to the body of work in the 
population genetics literature on the Reduction Principle 
for the evolution of genetic systems and migration, and 
provide new examples of departure from reduction. The 
common mathematics underlying all of these models may
lead to the eventual development of a unified theoretical
treatment in which the different ecological and evolutionary
phenomena are seen as different aspects of a single
phenomenological structure.